Static Provisioned Environments for Specialized Workloads: GPU and CPU-Intensive Tasks

Dove-Wing
12 min read

Introduction

Modern computational workloads often require specialized resources, particularly for machine learning, scientific computing, and data processing tasks. While Kubernetes offers solutions for GPU and CPU-intensive workloads, its overhead can be significant. This article demonstrates how to create isolated environments specifically optimized for GPU-accelerated applications and CPU-intensive tasks using Linux's native isolation capabilities.

System Architecture Overview

Our approach creates two distinct isolated environments (a minimal sketch of the underlying isolation primitives follows the lists below):

  1. GPU Partition: For machine learning, rendering, or other GPU-accelerated workloads

  2. CPU-Intensive Partition: For multi-threaded computational tasks that benefit from dedicated CPU resources

Each environment will have:

  • Resource isolation via namespaces and cgroups

  • Optimized libraries and tooling for their specific workload type

  • Direct hardware access where required

  • Performance monitoring capabilities
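
Before the full scripts, the sketch below shows, in miniature, the primitives both environments are built from: a network namespace for an isolated network stack, a cgroup for resource caps, and unshare for mount/PID isolation. The name demo-env is a throwaway placeholder, and the paths assume the cgroup v1 layout used throughout this article.

# Minimal sketch of the building blocks (not part of the actual setup)
ip netns add demo-env                                    # isolated network stack
mkdir -p /sys/fs/cgroup/cpu/demo-env                     # cgroup v1 CPU controller
echo 200000 > /sys/fs/cgroup/cpu/demo-env/cpu.cfs_quota_us    # bandwidth cap: 2 cores' worth
echo 100000 > /sys/fs/cgroup/cpu/demo-env/cpu.cfs_period_us
echo $$ > /sys/fs/cgroup/cpu/demo-env/cgroup.procs       # current shell (and its children) join the cgroup
ip netns exec demo-env unshare --mount --uts --ipc --pid --fork /bin/bash -c 'ip addr; nproc'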

Base System Preparation

Start with a comprehensive initialization script:

#!/bin/bash
# /opt/specialized-environments/init.sh

# Load kernel modules required for GPU support
modprobe nvidia
modprobe nvidia_uvm
# (cpuset is a cgroup controller built into the kernel, so no module load is needed)

# Enable system settings for optimal performance
echo 1 > /proc/sys/kernel/numa_balancing
echo 1 > /proc/sys/vm/zone_reclaim_mode
echo 1 > /proc/sys/net/ipv4/ip_forward

# Create base directories
mkdir -p /var/lib/environments/{gpu-compute,cpu-compute}
mkdir -p /var/lib/environment-data/{gpu-compute,cpu-compute}
mkdir -p /var/run/environments

# Create network bridge for isolated environments
ip link add name compute-bridge type bridge
ip addr add 10.200.0.1/24 dev compute-bridge
ip link set compute-bridge up

# Setup iptables for outbound connectivity
iptables -t nat -A POSTROUTING -s 10.200.0.0/24 -j MASQUERADE

# Set global CPU performance governor for maximum performance
for governor in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > $governor
done

# Load environment configurations and run setup
source /etc/specialized-environments/gpu-compute.conf
source /etc/specialized-environments/cpu-compute.conf

# Initialize environments
setup_gpu_environment
setup_cpu_environment

# Start monitoring
systemctl start environment-monitor.service
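
With the script in place, a quick spot check (suggested here for convenience, not required by the setup) confirms the bridge, directories, governor, and NAT rule are configured:

ip addr show compute-bridge                  # expect 10.200.0.1/24, state UP
ls /var/lib/environments                     # cpu-compute  gpu-compute
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor    # performance
iptables -t nat -S POSTROUTING | grep 10.200.0.0             # MASQUERADE rule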

GPU-Accelerated Environment Setup

The following script creates an isolated environment optimized for GPU workloads:

#!/bin/bash
# Part of /etc/specialized-environments/gpu-compute.conf

setup_gpu_environment() {
    local ENV_NAME="gpu-compute"
    local ENV_ROOT="/var/lib/environments/${ENV_NAME}"
    local DATA_ROOT="/var/lib/environment-data/${ENV_NAME}"

    echo "Setting up ${ENV_NAME} environment..."

    # Create network namespace
    ip netns add ${ENV_NAME}

    # Create veth pair and attach the host side to the compute bridge
    ip link add veth-${ENV_NAME} type veth peer name veth0
    ip link set veth-${ENV_NAME} master compute-bridge
    ip link set veth-${ENV_NAME} up
    ip link set veth0 netns ${ENV_NAME}

    # Configure networking inside the namespace; the bridge address (10.200.0.1) is the gateway
    ip netns exec ${ENV_NAME} ip addr add 10.200.0.11/24 dev veth0
    ip netns exec ${ENV_NAME} ip link set veth0 up
    ip netns exec ${ENV_NAME} ip link set lo up
    ip netns exec ${ENV_NAME} ip route add default via 10.200.0.1

    # Prepare filesystem structure for GPU environment
    if [ ! -d "${ENV_ROOT}/usr/local/cuda" ]; then
        # Create minimal filesystem with CUDA support
        mkdir -p ${ENV_ROOT}/{bin,sbin,lib,lib64,usr,etc,var,tmp,proc,sys,dev,run,opt}
        mkdir -p ${ENV_ROOT}/usr/{bin,sbin,lib,lib64,local,share}
        mkdir -p ${ENV_ROOT}/usr/local/{cuda,bin,lib64}
        mkdir -p ${ENV_ROOT}/var/{log,tmp}
        mkdir -p ${ENV_ROOT}/opt/ml/{model,input,output}

        # Copy essential binaries together with their shared-library dependencies
        # (a binary copied without its libraries will not run inside the chroot)
        for bin in bash ls mkdir cp rm mount; do
            cp /bin/$bin ${ENV_ROOT}/bin/
            for lib in $(ldd /bin/$bin | grep -o '/[^ ]*'); do
                mkdir -p ${ENV_ROOT}$(dirname $lib)
                cp -n $lib ${ENV_ROOT}$lib
            done
        done

        # Copy CUDA libraries and binaries (for NVIDIA GPUs)
        if [ -d "/usr/local/cuda" ]; then
            cp -r /usr/local/cuda/bin/* ${ENV_ROOT}/usr/local/cuda/bin/
            cp -r /usr/local/cuda/lib64/* ${ENV_ROOT}/usr/local/cuda/lib64/
            cp -r /usr/local/cuda/include ${ENV_ROOT}/usr/local/cuda/

            # Create CUDA symbolic links
            ln -s /usr/local/cuda/lib64/libcudart.so ${ENV_ROOT}/usr/lib64/
            ln -s /usr/local/cuda/lib64/libcublas.so ${ENV_ROOT}/usr/lib64/
            ln -s /usr/local/cuda/lib64/libcudnn.so ${ENV_ROOT}/usr/lib64/
        fi

        # Copy NVIDIA driver libraries
        for lib in /usr/lib/x86_64-linux-gnu/libnvidia-*.so*; do
            if [ -f "$lib" ]; then
                mkdir -p ${ENV_ROOT}/usr/lib/x86_64-linux-gnu/
                cp $lib ${ENV_ROOT}/usr/lib/x86_64-linux-gnu/
            fi
        done

        # Copy Python together with its standard library and ML packages (if using PyTorch/TensorFlow)
        if [ -d "/usr/local/lib/python3.9" ]; then
            mkdir -p ${ENV_ROOT}/usr/local/lib
            cp -r /usr/local/lib/python3.9 ${ENV_ROOT}/usr/local/lib/
            cp /usr/local/bin/python3.9 ${ENV_ROOT}/usr/local/bin/
            for lib in $(ldd /usr/local/bin/python3.9 | grep -o '/[^ ]*'); do
                mkdir -p ${ENV_ROOT}$(dirname $lib)
                cp -n $lib ${ENV_ROOT}$lib
            done
            ln -s python3.9 ${ENV_ROOT}/usr/local/bin/python
        fi

        # Copy dependencies for typical ML frameworks
        for lib in $(find /usr/lib /lib -name "libgomp*.so*" -o -name "libnuma*.so*" -o -name "libcudnn*.so*" 2>/dev/null); do
            if [ -f "$lib" ]; then
                mkdir -p ${ENV_ROOT}/$(dirname $lib)
                cp $lib ${ENV_ROOT}/$lib
            fi
        done

        # Create configuration for GPU environment
        cat > ${ENV_ROOT}/etc/environment <<EOF
PATH=/usr/local/cuda/bin:/usr/local/bin:/usr/bin:/bin
LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/lib64:/usr/lib:/lib
CUDA_VISIBLE_DEVICES=0
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,utility
EOF

        # Create sample GPU test script
        cat > ${ENV_ROOT}/opt/gpu-test.py <<EOF
#!/usr/bin/env python
import torch
print("CUDA Available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA Device Count:", torch.cuda.device_count())
    print("CUDA Device Name:", torch.cuda.get_device_name(0))
    # Simple tensor operation on GPU
    x = torch.rand(5, 3).cuda()
    print("GPU Tensor:", x)
EOF
        chmod +x ${ENV_ROOT}/opt/gpu-test.py
    fi

    # Prepare persistent data directories
    mkdir -p ${DATA_ROOT}/{logs,models,datasets}

    # Setup resource limits with cgroups
    # (paths assume the cgroup v1 hierarchy; on cgroup v2 hosts rely on the
    #  systemd-run properties below instead)
    mkdir -p /sys/fs/cgroup/cpu/${ENV_NAME}
    mkdir -p /sys/fs/cgroup/memory/${ENV_NAME}

    # Limit to 90% of system memory but prioritize GPU workloads
    TOTAL_MEM=$(free -b | grep Mem | awk '{print $2}')
    GPU_MEM_LIMIT=$(echo "$TOTAL_MEM * 0.9" | bc | cut -d. -f1)

    echo $GPU_MEM_LIMIT > /sys/fs/cgroup/memory/${ENV_NAME}/memory.limit_in_bytes
    echo 800000 > /sys/fs/cgroup/cpu/${ENV_NAME}/cpu.cfs_quota_us  # 800% = 8 cores
    echo 100000 > /sys/fs/cgroup/cpu/${ENV_NAME}/cpu.cfs_period_us

    # Create bind mounts for persistent data
    mount --bind ${DATA_ROOT}/logs ${ENV_ROOT}/var/log
    mount --bind ${DATA_ROOT}/models ${ENV_ROOT}/opt/ml/model
    mount --bind ${DATA_ROOT}/datasets ${ENV_ROOT}/opt/ml/input

    # Mount required special filesystems
    mount -t proc proc ${ENV_ROOT}/proc
    mount -t sysfs sysfs ${ENV_ROOT}/sys

    # Special handling for GPU devices:
    # recreate the host's NVIDIA device nodes inside the environment's /dev
    mkdir -p ${ENV_ROOT}/dev

    # Find and create all NVIDIA device nodes
    for i in $(find /dev -name "nvidia*" -type c); do
        DEVNAME=$(basename $i)
        MAJOR=$(stat -c "%t" $i | sed 's/^0*//' | tr -d '\n')
        MINOR=$(stat -c "%T" $i | sed 's/^0*//' | tr -d '\n')
        if [ -z "$MAJOR" ]; then MAJOR="0"; fi
        if [ -z "$MINOR" ]; then MINOR="0"; fi
        MAJOR=$(printf "%d" 0x$MAJOR)
        MINOR=$(printf "%d" 0x$MINOR)

        mknod ${ENV_ROOT}/dev/$DEVNAME c $MAJOR $MINOR
        chmod 666 ${ENV_ROOT}/dev/$DEVNAME
    done

    # Start GPU environment as a transient systemd service
    systemd-run --unit=${ENV_NAME} --slice=specialized \
        --property=CPUQuota=800% \
        --property=IOWeight=800 \
        --property=Restart=always \
        /opt/specialized-environments/run-isolated.sh ${ENV_NAME} \
        /bin/bash -c 'source /etc/environment && python /opt/gpu-test.py && sleep infinity'

    # Port forwarding for services (e.g., Jupyter or ML server)
    iptables -t nat -A PREROUTING -p tcp --dport 8888 -j DNAT --to-destination 10.200.0.11:8888

    echo "${ENV_NAME} environment setup complete"
}
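
With the function applied, the environment can be exercised manually using the same wrapper the service runs. Assuming PyTorch made it into the copied site-packages, the bundled test script reports whether the GPU is visible from inside the namespaces:

/opt/specialized-environments/run-isolated.sh gpu-compute \
    /bin/bash -c 'source /etc/environment && python /opt/gpu-test.py'

# Typical output (details depend on your hardware):
# CUDA Available: True
# CUDA Device Count: 1
# CUDA Device Name: <your GPU model>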

CPU-Intensive Environment Setup

The following script creates an isolated environment optimized for CPU-intensive tasks:

#!/bin/bash
# Part of /etc/specialized-environments/cpu-compute.conf

setup_cpu_environment() {
    local ENV_NAME="cpu-compute"
    local ENV_ROOT="/var/lib/environments/${ENV_NAME}"
    local DATA_ROOT="/var/lib/environment-data/${ENV_NAME}"

    echo "Setting up ${ENV_NAME} environment..."

    # Create network namespace
    ip netns add ${ENV_NAME}

    # Create veth pair and attach the host side to the compute bridge
    ip link add veth-${ENV_NAME} type veth peer name veth0
    ip link set veth-${ENV_NAME} master compute-bridge
    ip link set veth-${ENV_NAME} up
    ip link set veth0 netns ${ENV_NAME}

    # Configure networking inside the namespace; the bridge address (10.200.0.1) is the gateway
    ip netns exec ${ENV_NAME} ip addr add 10.200.0.21/24 dev veth0
    ip netns exec ${ENV_NAME} ip link set veth0 up
    ip netns exec ${ENV_NAME} ip link set lo up
    ip netns exec ${ENV_NAME} ip route add default via 10.200.0.1

    # Prepare filesystem structure for CPU-intensive environment
    if [ ! -d "${ENV_ROOT}/usr/local/bin" ]; then
        # Create minimal filesystem with CPU optimization tools
        mkdir -p ${ENV_ROOT}/{bin,sbin,lib,lib64,usr,etc,var,tmp,proc,sys,dev,run,opt}
        mkdir -p ${ENV_ROOT}/usr/{bin,sbin,lib,lib64,local,share}
        mkdir -p ${ENV_ROOT}/usr/local/{bin,lib64,include}
        mkdir -p ${ENV_ROOT}/var/{log,tmp}
        mkdir -p ${ENV_ROOT}/opt/data/{input,output}

        # Copy essential binaries together with their shared-library dependencies
        # (a binary copied without its libraries will not run inside the chroot)
        for bin in bash ls mkdir cp rm mount; do
            cp /bin/$bin ${ENV_ROOT}/bin/
            for lib in $(ldd /bin/$bin | grep -o '/[^ ]*'); do
                mkdir -p ${ENV_ROOT}$(dirname $lib)
                cp -n $lib ${ENV_ROOT}$lib
            done
        done

        # Copy CPU optimization libraries (OpenMP, MPI, etc.)
        for lib in $(find /usr/lib /lib -name "libgomp*.so*" -o -name "libopenmpi*.so*" -o -name "libmpi*.so*" -o -name "libomp*.so*" -o -name "libnuma*.so*" 2>/dev/null); do
            if [ -f "$lib" ]; then
                mkdir -p ${ENV_ROOT}/$(dirname $lib)
                cp $lib ${ENV_ROOT}/$lib
            fi
        done

        # Copy high-performance computing tools
        # (copying only these front-end binaries is a simplification: a working compiler
        #  also needs cc1/cc1plus, headers, and crt object files, so for real workloads
        #  build the rootfs with a tool such as debootstrap or precompile on the host)
        if [ -f "/usr/bin/gcc" ]; then cp /usr/bin/gcc ${ENV_ROOT}/usr/bin/; fi
        if [ -f "/usr/bin/g++" ]; then cp /usr/bin/g++ ${ENV_ROOT}/usr/bin/; fi
        if [ -f "/usr/bin/make" ]; then cp /usr/bin/make ${ENV_ROOT}/usr/bin/; fi
        if [ -f "/usr/bin/cmake" ]; then cp /usr/bin/cmake ${ENV_ROOT}/usr/bin/; fi

        # Copy OpenBLAS/LAPACK for scientific computing
        for lib in $(find /usr/lib /lib -name "libblas*.so*" -o -name "liblapack*.so*" -o -name "libopenblas*.so*" 2>/dev/null); do
            if [ -f "$lib" ]; then
                mkdir -p ${ENV_ROOT}/$(dirname $lib)
                cp $lib ${ENV_ROOT}/$lib
            fi
        done

        # Create configuration for CPU optimization
        cat > ${ENV_ROOT}/etc/environment <<EOF
PATH=/usr/local/bin:/usr/bin:/bin
LD_LIBRARY_PATH=/usr/local/lib64:/usr/lib64:/usr/lib:/lib
OMP_NUM_THREADS=16
MKL_NUM_THREADS=16
OPENBLAS_NUM_THREADS=16
VECLIB_MAXIMUM_THREADS=16
NUMEXPR_NUM_THREADS=16
EOF

        # Create sample CPU test script
        cat > ${ENV_ROOT}/opt/cpu-test.c <<EOF
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
#include <time.h>

#define SIZE 2000
#define ITERATIONS 5

void matrix_multiply(double *A, double *B, double *C, int n) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++) {
                sum += A[i*n+k] * B[k*n+j];
            }
            C[i*n+j] = sum;
        }
    }
}

int main() {
    double *A = (double*)malloc(SIZE*SIZE*sizeof(double));
    double *B = (double*)malloc(SIZE*SIZE*sizeof(double));
    double *C = (double*)malloc(SIZE*SIZE*sizeof(double));

    // Initialize matrices
    for (int i = 0; i < SIZE*SIZE; i++) {
        A[i] = (double)rand() / RAND_MAX;
        B[i] = (double)rand() / RAND_MAX;
    }

    printf("Running with %d OpenMP threads\n", omp_get_max_threads());

    double start = omp_get_wtime();  // wall-clock time; clock() would sum CPU time across threads

    // Perform matrix multiplications
    for (int i = 0; i < ITERATIONS; i++) {
        matrix_multiply(A, B, C, SIZE);
    }

    double end = omp_get_wtime();
    double time_spent = end - start;

    printf("Completed %d iterations of %dx%d matrix multiplication in %.2f seconds\n", 
           ITERATIONS, SIZE, SIZE, time_spent);

    free(A);
    free(B);
    free(C);
    return 0;
}
EOF

        # Compile the CPU test
        cat > ${ENV_ROOT}/opt/compile-test.sh <<EOF
#!/bin/bash
cd /opt
gcc -fopenmp -O3 cpu-test.c -o cpu-benchmark
EOF
        chmod +x ${ENV_ROOT}/opt/compile-test.sh
    fi

    # Prepare persistent data directories
    mkdir -p ${DATA_ROOT}/{logs,data,results}

    # Setup resource limits with cgroups
    mkdir -p /sys/fs/cgroup/cpu/${ENV_NAME}
    mkdir -p /sys/fs/cgroup/memory/${ENV_NAME}
    mkdir -p /sys/fs/cgroup/cpuset/${ENV_NAME}

    # Determine the cores to dedicate; cores 0-1 stay reserved for the system
    CPU_COUNT=$(nproc)
    LAST_CORE=$(($CPU_COUNT - 1))
    DEDICATED_CORES=$(($CPU_COUNT - 2))

    # Pin the environment to specific cores (e.g., cores 2-15 on a 16-core machine)
    echo "2-$LAST_CORE" > /sys/fs/cgroup/cpuset/${ENV_NAME}/cpuset.cpus
    # cpuset.mems takes NUMA memory nodes, not cores; "0" is correct for single-socket machines
    echo "0" > /sys/fs/cgroup/cpuset/${ENV_NAME}/cpuset.mems

    # Set memory limits
    TOTAL_MEM=$(free -b | grep Mem | awk '{print $2}')
    CPU_MEM_LIMIT=$(echo "$TOTAL_MEM * 0.7" | bc | cut -d. -f1)  # 70% of system memory
    echo $CPU_MEM_LIMIT > /sys/fs/cgroup/memory/${ENV_NAME}/memory.limit_in_bytes

    # Set CPU bandwidth to match the dedicated cores (100ms period, 100% per core)
    echo $(($DEDICATED_CORES * 100000)) > /sys/fs/cgroup/cpu/${ENV_NAME}/cpu.cfs_quota_us  # e.g. 1400000 = 14 cores
    echo 100000 > /sys/fs/cgroup/cpu/${ENV_NAME}/cpu.cfs_period_us

    # Create bind mounts for persistent data
    mount --bind ${DATA_ROOT}/logs ${ENV_ROOT}/var/log
    mount --bind ${DATA_ROOT}/data ${ENV_ROOT}/opt/data/input
    mount --bind ${DATA_ROOT}/results ${ENV_ROOT}/opt/data/output

    # Mount required special filesystems
    mount -t proc proc ${ENV_ROOT}/proc
    mount -t sysfs sysfs ${ENV_ROOT}/sys

    # Start CPU environment as a transient systemd service
    systemd-run --unit=${ENV_NAME} --slice=specialized \
        --property=CPUQuota=$(($DEDICATED_CORES * 100))% \
        --property=CPUAffinity=2-${LAST_CORE} \
        --property=IOWeight=900 \
        --property=Restart=always \
        /opt/specialized-environments/run-isolated.sh ${ENV_NAME} \
        /bin/bash -c 'source /etc/environment && /opt/compile-test.sh && /opt/cpu-benchmark && sleep infinity'

    # Port forwarding for services
    iptables -t nat -A PREROUTING -p tcp --dport 9090 -j DNAT --to-destination 10.200.0.21:9090

    echo "${ENV_NAME} environment setup complete"
}
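
As with the GPU partition, a manual run is useful before relying on the transient service. The first command confirms the core pinning; the second compiles and runs the benchmark inside the environment (assuming the toolchain copied into the rootfs is complete enough to compile):

cat /sys/fs/cgroup/cpuset/cpu-compute/cpuset.cpus    # e.g. 2-15 on a 16-core machine
/opt/specialized-environments/run-isolated.sh cpu-compute \
    /bin/bash -c 'source /etc/environment && /opt/compile-test.sh && /opt/cpu-benchmark'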

Execution Script for Isolated Environments

This script handles the namespace isolation for both environments:

#!/bin/bash
# /opt/specialized-environments/run-isolated.sh

ENV_NAME="$1"
shift

ENV_ROOT="/var/lib/environments/${ENV_NAME}"

# Enter the network namespace first, unshare the remaining namespaces, chroot,
# mount /proc and /sys inside, then exec the requested command.
# Passing "$@" through (rather than flattening it into a string) preserves argument quoting.
ip netns exec ${ENV_NAME} unshare --mount --uts --ipc --pid --fork \
    chroot ${ENV_ROOT} /bin/bash -c 'mount -t proc proc /proc && mount -t sysfs sysfs /sys && exec "$@"' bash "$@"
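
The wrapper also doubles as a convenient way to get an interactive shell inside either environment for debugging; for example:

/opt/specialized-environments/run-isolated.sh gpu-compute /bin/bash

Exiting the shell releases the mount and PID namespaces created by unshare, while the network namespace itself persists until the cleanup script removes it.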

Performance Monitoring System

Create a monitoring script to track resource utilization and performance:

#!/bin/bash
# /opt/specialized-environments/monitor.sh

# Configuration
CHECK_INTERVAL=30
LOG_DIR="/var/log/specialized-environments"
mkdir -p ${LOG_DIR}

# GPU monitoring function
monitor_gpu_environment() {
    # Log timestamp
    TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")

    # Get GPU stats using nvidia-smi
    if command -v nvidia-smi &> /dev/null; then
        GPU_UTIL=$(nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits)
        GPU_MEM=$(nvidia-smi --query-gpu=memory.used --format=csv,noheader)
        GPU_TEMP=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader)

        echo "$TIMESTAMP,GPU,$GPU_UTIL%,$GPU_MEM,$GPU_TEMP°C" >> ${LOG_DIR}/gpu-metrics.csv
    fi

    # Check if environment is running
    systemctl is-active --quiet gpu-compute
    if [ $? -ne 0 ]; then
        echo "$TIMESTAMP: GPU environment not running, restarting..." >> ${LOG_DIR}/gpu-events.log
        systemctl restart gpu-compute
    fi
}

# CPU monitoring function
monitor_cpu_environment() {
    # Log timestamp
    TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")

    # Get CPU stats for the dedicated cores
    CPU_UTIL=$(top -b -n1 | grep "Cpu(s)" | awk '{print $2 + $4}')
    MEM_UTIL=$(free | grep Mem | awk '{print $3/$2 * 100.0}')

    echo "$TIMESTAMP,CPU,$CPU_UTIL%,$MEM_UTIL%" >> ${LOG_DIR}/cpu-metrics.csv

    # Check if environment is running
    systemctl is-active --quiet cpu-compute
    if [ $? -ne 0 ]; then
        echo "$TIMESTAMP: CPU environment not running, restarting..." >> ${LOG_DIR}/cpu-events.log
        systemctl restart cpu-compute
    fi
}

# Create header for CSV files if they don't exist
if [ ! -f ${LOG_DIR}/gpu-metrics.csv ]; then
    echo "Timestamp,Type,Utilization,Memory,Temperature" > ${LOG_DIR}/gpu-metrics.csv
fi

if [ ! -f ${LOG_DIR}/cpu-metrics.csv ]; then
    echo "Timestamp,Type,CPU_Utilization,Memory_Utilization" > ${LOG_DIR}/cpu-metrics.csv
fi

# Main monitoring loop
while true; do
    monitor_gpu_environment
    monitor_cpu_environment
    sleep ${CHECK_INTERVAL}
done
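
Because the metrics land in plain CSV, quick ad-hoc analysis needs nothing more than awk; for example, this one-liner (offered as a suggestion, not part of the monitor itself) reports average and peak GPU utilization:

awk -F, 'NR>1 {gsub(/%/,"",$3); sum+=$3; if ($3+0>max) max=$3+0; n++}
         END {if (n) printf "avg %.1f%%  peak %.0f%%  samples %d\n", sum/n, max, n}' \
    /var/log/specialized-environments/gpu-metrics.csv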

Create a systemd service for the monitoring system:

# /etc/systemd/system/environment-monitor.service
[Unit]
Description=Specialized Environment Monitoring
After=network.target

[Service]
Type=simple
ExecStart=/opt/specialized-environments/monitor.sh
Restart=always

[Install]
WantedBy=multi-user.target
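
Reload systemd and enable the unit so the monitor starts immediately and again on every boot:

systemctl daemon-reload
systemctl enable --now environment-monitor.service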

Boot-time Integration

Ensure the environments start at boot time:

# /etc/systemd/system/specialized-environments.service
[Unit]
Description=Specialized Computing Environments
After=network.target
Before=gpu-compute.service cpu-compute.service

[Service]
Type=oneshot
ExecStart=/opt/specialized-environments/init.sh
RemainAfterExit=true
ExecStop=/opt/specialized-environments/cleanup.sh

[Install]
WantedBy=multi-user.target
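
Register and enable this unit the same way; the environments are then provisioned automatically at boot and torn down through the cleanup script on shutdown:

systemctl daemon-reload
systemctl enable specialized-environments.service
systemctl start specialized-environments.service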

Environment Cleanup Script

For proper teardown of the environments:

#!/bin/bash
# /opt/specialized-environments/cleanup.sh

# Stop services
systemctl stop gpu-compute cpu-compute

# Unmount filesystems
for ENV_NAME in gpu-compute cpu-compute; do
    ENV_ROOT="/var/lib/environments/${ENV_NAME}"

    umount ${ENV_ROOT}/proc
    umount ${ENV_ROOT}/sys

    # Unmount environment-specific mounts
    if [ "${ENV_NAME}" == "gpu-compute" ]; then
        umount ${ENV_ROOT}/var/log
        umount ${ENV_ROOT}/opt/ml/model
        umount ${ENV_ROOT}/opt/ml/input
    else
        umount ${ENV_ROOT}/var/log
        umount ${ENV_ROOT}/opt/data/input
        umount ${ENV_ROOT}/opt/data/output
    fi
done

# Remove network namespaces
ip netns del gpu-compute
ip netns del cpu-compute

# Remove veth interfaces
ip link del veth-gpu-compute
ip link del veth-cpu-compute

# Remove bridge
ip link set compute-bridge down
ip link del compute-bridge

# Remove only the NAT rules added for these environments
# (a blanket "iptables -t nat -F" would also flush unrelated rules)
iptables -t nat -D POSTROUTING -s 10.200.0.0/24 -j MASQUERADE
iptables -t nat -D PREROUTING -p tcp --dport 8888 -j DNAT --to-destination 10.200.0.11:8888
iptables -t nat -D PREROUTING -p tcp --dport 9090 -j DNAT --to-destination 10.200.0.21:9090

# Reset CPU governor to the default on-demand policy
for governor in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo ondemand > $governor
done

Submitting Jobs to Specialized Environments

Create scripts to easily submit jobs to each environment:

GPU Job Submission

#!/bin/bash
# /usr/local/bin/submit-gpu-job

if [ $# -lt 1 ]; then
    echo "Usage: $0 <script.py> [args...]"
    exit 1
fi

SCRIPT="$1"
shift
ARGS="$@"
SCRIPT_NAME=$(basename "$SCRIPT")

# Copy script to GPU environment
cp "$SCRIPT" /var/lib/environment-data/gpu-compute/models/

# Run the script in the GPU environment
systemd-run --unit=gpu-job-$(date +%s) --slice=specialized \
    --property=CPUQuota=800% \
    /opt/specialized-environments/run-isolated.sh gpu-compute \
    /bin/bash -c "source /etc/environment && cd /opt/ml/model && python ${SCRIPT_NAME} ${ARGS} > /var/log/job-$(date +%s).log 2>&1"

echo "Job submitted to GPU environment"

CPU Job Submission

#!/bin/bash
# /usr/local/bin/submit-cpu-job

if [ $# -lt 1 ]; then
    echo "Usage: $0 <executable> [args...]"
    exit 1
fi

EXEC="$1"
shift
ARGS="$@"
EXEC_NAME=$(basename "$EXEC")

# Copy executable to CPU environment
cp "$EXEC" /var/lib/environment-data/cpu-compute/data/

# Run the executable in the CPU environment
systemd-run --unit=cpu-job-$(date +%s) --slice=specialized \
    --property=CPUQuota=1500% \
    --property=CPUAffinity=2-15 \
    /opt/specialized-environments/run-isolated.sh cpu-compute \
    /bin/bash -c "source /etc/environment && cd /opt/data/input && ./${EXEC_NAME} ${ARGS} > /var/log/job-$(date +%s).log 2>&1"

echo "Job submitted to CPU environment"

Resource Efficiency Compared to Kubernetes

The static provisioning approach provides several advantages over Kubernetes for specialized workloads:

  1. Direct hardware access: GPU passthrough is simpler with direct namespace isolation

  2. Reduced context switching: Dedicated CPU pinning eliminates scheduling overhead

  3. Lower memory overhead: No container runtime or orchestration overhead

  4. Optimized libraries: Environment contains only the necessary libraries for each workload type

  5. Streamlined I/O paths: Direct device access without abstraction layers

Security Benefits

  1. Reduced attack surface: No container runtime exploits or Kubernetes API vulnerabilities

  2. Isolated device access: Direct control over which devices are exposed to each environment

  3. Simplified privilege model: No complex RBAC or container security contexts

  4. Resource boundaries: Hard cgroup limits prevent resource starvation between environments

Practical Use Cases

This approach is particularly well-suited for:

  1. Machine learning training: GPU-optimized environment for frameworks like TensorFlow/PyTorch

  2. Scientific computing: CPU-optimized environment for simulations and data analysis

  3. Rendering farms: Predictable GPU resource allocation for graphics workloads

  4. High-performance databases: CPU-isolated environment for database operations

  5. Signal processing: Real-time processing with dedicated CPU resources

Conclusion

For specialized GPU and CPU-intensive workloads, this static provisioning approach offers significant advantages in terms of performance, resource utilization, and simplicity compared to container orchestration platforms. By leveraging Linux's native isolation capabilities and optimizing each environment for its specific computational needs, this solution provides a lightweight yet powerful alternative for organizations that need to maximize the performance of specialized hardware resources without the overhead of container orchestration.

While this approach requires more initial setup and customization than deploying containers on Kubernetes, it offers greater control and efficiency for stable, long-running specialized workloads where direct hardware access and performance are critical factors.
