Mastering Kubernetes Security: A Deep Dive into SecurityContext

Omkar ShelkeOmkar Shelke
14 min read

🛡️ 1. Introduction to SecurityContext in Kubernetes

A SecurityContext in Kubernetes defines privilege and access control settings for pods or containers, allowing you to control how processes run, access resources, and interact with the system. It is a critical component for securing Kubernetes workloads by enforcing least-privilege principles.

  • Pod-Level SecurityContext 🗂️: Applies security settings to all containers in a pod and can affect the pod’s volumes. It’s defined under spec.securityContext.

  • Container-Level SecurityContext 🖥️: Applies to a specific container and can override pod-level settings for that container. It’s defined under spec.containers[].securityContext.

The key difference is scope:

  • Pod-level settings provide a baseline for all containers and volumes in the pod.

  • Container-level settings allow fine-grained customization for individual containers, overriding pod-level settings where applicable.


📂 2. Pod-Level SecurityContext

The pod-level securityContext is defined in the pod’s spec and applies to all containers in the pod unless overridden by a container-level securityContext. It also applies to certain volume-related settings (e.g., fsGroup and seLinuxOptions).

🛠️ Fields in Pod-Level SecurityContext

Here’s a comprehensive list of fields available at the pod level, their purpose, and examples:

  1. runAsUser 👤:

    • Purpose: Specifies the user ID (UID) for all containers’ processes in the pod.

    • Use Case: Ensures containers don’t run as root, reducing the risk of privilege escalation.

    • Example: A web server pod where all containers should run as a non-root user for security.

        apiVersion: v1
        kind: Pod
        metadata:
          name: web-server-pod
        spec:
          securityContext:
            runAsUser: 1000  # All containers run as UID 1000
          containers:
          - name: nginx
            image: nginx
            ports:
            - containerPort: 80
      

      In this example, a web server (e.g., Nginx) runs as UID 1000, preventing root-level access even if the container is compromised.

  2. runAsGroup 👥:

    • Purpose: Sets the primary group ID (GID) for all containers’ processes.

    • Use Case: Controls group ownership for files created by containers, useful for shared volumes.

    • Example: A pod with a shared volume where files need consistent group ownership.

        apiVersion: v1
        kind: Pod
        metadata:
          name: shared-volume-pod
        spec:
          securityContext:
            runAsUser: 1000
            runAsGroup: 3000  # Primary group ID for processes
          volumes:
          - name: shared-data
            emptyDir: {}
          containers:
          - name: app
            image: busybox
            command: ["sh", "-c", "echo hello > /data/testfile && sleep 1h"]
            volumeMounts:
            - name: shared-data
              mountPath: /data
      

      Files created in the /data volume will be owned by GID 3000, ensuring consistent group access.

  3. runAsNonRoot 🔒:

    • Purpose: Ensures all containers run as a non-root user (UID ≠ 0). If set to true, Kubernetes rejects the pod if any container tries to run as root.

    • Use Case: Enforce a policy where no container in the pod can run as root.

    • Example: A corporate policy requires all pods to run non-root for compliance.

        apiVersion: v1
        kind: Pod
        metadata:
          name: non-root-pod
        spec:
          securityContext:
            runAsNonRoot: true  # Enforces non-root user
          containers:
          - name: app
            image: nginx
            ports:
            - containerPort: 80
      

      If the container tries to run as root, the pod will fail to start.

  4. fsGroup 💾:

    • Purpose: Sets the group ID for volume ownership and permissions. Kubernetes applies this GID to volumes that support ownership management (e.g., emptyDir, persistentVolumeClaim).

    • Use Case: Ensures files in a shared volume are accessible by a specific group, such as in a multi-container pod.

    • Example: A pod with a shared volume for a data processing application.

        apiVersion: v1
        kind: Pod
        metadata:
          name: data-processing-pod
        spec:
          securityContext:
            runAsUser: 1000
            fsGroup: 2000  # Volume files owned by GID 2000
          volumes:
          - name: data-vol
            emptyDir: {}
          containers:
          - name: processor
            image: busybox
            command: ["sh", "-c", "echo data > /data/output && sleep 1h"]
            volumeMounts:
            - name: data-vol
              mountPath: /data
      

      Files in /data will be owned by GID 2000, ensuring group-level access control.

  5. supplementalGroups 👨‍👩‍👧‍👦:

    • Purpose: Adds additional group IDs to container processes, beyond the primary runAsGroup.

    • Use Case: Grants access to resources owned by multiple groups, such as shared storage.

    • Example: A pod accessing multiple shared volumes with different group ownerships.

        apiVersion: v1
        kind: Pod
        metadata:
          name: multi-group-pod
        spec:
          securityContext:
            runAsUser: 1000
            runAsGroup: 3000
            supplementalGroups: [4000, 5000]  # Additional group memberships
          containers:
          - name: app
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
      

      Processes in the container belong to GIDs 3000, 4000, and 5000, allowing access to resources owned by these groups.

  6. supplementalGroupsPolicy (Kubernetes v1.33+, beta) 🛠️:

    • Purpose: Controls how supplementary groups are calculated. Options are:

      • Merge: Merges groups from the container image’s /etc/group with fsGroup and supplementalGroups.

      • Strict: Only uses groups specified in fsGroup, supplementalGroups, or runAsGroup, ignoring /etc/group.

    • Use Case: Avoid unintended group memberships from the container image for stricter security.

    • Example: A pod requiring strict group control for compliance.

        apiVersion: v1
        kind: Pod
        metadata:
          name: strict-groups-pod
        spec:
          securityContext:
            runAsUser: 1000
            runAsGroup: 3000
            supplementalGroups: [4000]
            supplementalGroupsPolicy: Strict  # Only specified groups are used
          containers:
          - name: app
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
      

      The container process will only have GIDs 3000 and 4000, ignoring any groups defined in the image’s /etc/group.

  7. fsGroupChangePolicy ⚙️:

    • Purpose: Controls how Kubernetes changes ownership and permissions for volumes. Options are:

      • OnRootMismatch: Only changes permissions if the volume’s root directory doesn’t match the expected fsGroup.

      • Always: Always changes permissions when the volume is mounted.

    • Use Case: Optimize pod startup time for large volumes by reducing unnecessary permission changes.

    • Example: A pod with a large persistent volume.

        apiVersion: v1
        kind: Pod
        metadata:
          name: large-volume-pod
        spec:
          securityContext:
            runAsUser: 1000
            fsGroup: 2000
            fsGroupChangePolicy: OnRootMismatch  # Optimize permission changes
          volumes:
          - name: data
            persistentVolumeClaim:
              claimName: data-pvc
          containers:
          - name: app
            image: busybox
            volumeMounts:
            - name: data
              mountPath: /data
      

      This reduces startup time by only changing permissions when necessary.

  8. seLinuxOptions 🔐:

    • Purpose: Assigns SELinux labels to containers and volumes for access control.

    • Use Case: Enforce mandatory access control in environments with SELinux enabled (e.g., Red Hat systems).

    • Example: A pod running in an SELinux-enabled cluster.

        apiVersion: v1
        kind: Pod
        metadata:
          name: selinux-pod
        spec:
          securityContext:
            seLinuxOptions:
              level: "s0:c123,c456"  # SELinux label for processes and volumes
          containers:
          - name: app
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
      

      All containers and volumes use the specified SELinux label, ensuring compliance with SELinux policies.

  9. seLinuxChangePolicy (Kubernetes v1.33+, beta) 🔄:

    • Purpose: Controls SELinux relabelling behaviour. Options are:

      • MountOption: Uses mount options for faster relabelling (requires SELinuxMount feature gate).

      • Recursive: Recursively relabels all files in the volume.

    • Use Case: Optimize SELinux relabelling for performance or allow multiple pods with different labels to share a volume.

    • Example: A pod opting out of mount-based relabelling for compatibility.

        apiVersion: v1
        kind: Pod
        metadata:
          name: selinux-recursive-pod
        spec:
          securityContext:
            seLinuxOptions:
              level: "s0:c123,c456"
            seLinuxChangePolicy: Recursive  # Recursive relabeling
          containers:
          - name: app
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
      

      This ensures recursive relabeling, allowing multiple pods with different SELinux labels to share a volume.

  10. procMount (Kubernetes v1.33+, beta) 🖥️:

    • Purpose: Controls the /proc filesystem’s mount behaviour. Options are:

      • Default: Masks certain /proc paths (e.g., /proc/kcore) and makes others read-only.

      • Unmasked: Exposes all /proc paths, useful for nested container runtimes.

    • Use Case: Running containers within containers (e.g., Docker-in-Docker).

    • Example: A pod running a CI/CD pipeline with nested containers.

        apiVersion: v1
        kind: Pod
        metadata:
          name: dind-pod
        spec:
          securityContext:
            procMount: Unmasked  # Expose full /proc
          hostUsers: false  # Required for Unmasked
          containers:
          - name: docker
            image: docker:dind
            command: ["dockerd"]
      

      This allows the Docker daemon to access the full /proc filesystem for container management.

🌐 Real-Life Example for Pod-Level SecurityContext

Scenario: A company runs a microservices application with multiple pods, each containing multiple containers (e.g., an app and a logging sidecar). To comply with security policies, all containers must run as non-root, and shared volumes must be accessible by a specific group.

apiVersion: v1
kind: Pod
metadata:
  name: microservice-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    runAsNonRoot: true
  volumes:
  - name: logs
    emptyDir: {}
  containers:
  - name: app
    image: my-app:1.0
    volumeMounts:
    - name: logs
      mountPath: /logs
  - name: log-collector
    image: fluentd
    volumeMounts:
    - name: logs
      mountPath: /logs

Explanation:

  • All containers run as UID 1000 and GID 3000.

  • The logs volume is owned by GID 2000 (fsGroup), ensuring both containers can write to it.

  • runAsNonRoot: true enforces non-root execution, aligning with compliance requirements.


🖥️ 3. Container-Level SecurityContext

The container-level securityContext is defined under spec.containers[].securityContext and applies only to the specific container. It can override pod-level settings for that container but doesn’t affect volumes.

🛠️ Fields in Container-Level SecurityContext

Here’s a comprehensive list of fields available at the container level:

  1. runAsUser 👤:

    • Purpose: Overrides the pod-level runAsUser for the specific container.

    • Use Case: A specific container needs to run as a different user (e.g., root for administrative tasks).

    • Example: A pod with a sidecar requiring root privileges.

        apiVersion: v1
        kind: Pod
        metadata:
          name: mixed-user-pod
        spec:
          securityContext:
            runAsUser: 1000
          containers:
          - name: app
            image: nginx
            ports:
            - containerPort: 80
          - name: admin-tool
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
            securityContext:
              runAsUser: 0  # Runs as root, overriding pod-level setting
      
  2. runAsGroup 👥:

    • Purpose: Overrides the pod-level runAsGroup for the container’s primary group ID.

    • Use Case: A container needs a different primary group for specific access requirements.

    • Example: A container accessing a volume with a unique group.

        apiVersion: v1
        kind: Pod
        metadata:
          name: custom-group-pod
        spec:
          securityContext:
            runAsGroup: 3000
          containers:
          - name: app
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
            securityContext:
              runAsGroup: 4000  # Overrides pod-level runAsGroup
      
  3. runAsNonRoot 🔒:

    • Purpose: Enforces non-root execution for the specific container, overriding pod-level settings.

    • Use Case: Ensure a specific container adheres to non-root policies, even if the pod allows root.

    • Example: A sidecar container must run non-root for security.

        apiVersion: v1
        kind: Pod
        metadata:
          name: non-root-sidecar-pod
        spec:
          containers:
          - name: app
            image: nginx
          - name: sidecar
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
            securityContext:
              runAsNonRoot: true
      
  4. capabilities ⚙️:

    • Purpose: Adds or drops Linux capabilities for the container.

    • Use Case: Grant specific privileges (e.g., NET_ADMIN) without full root access.

    • Example: A container needs to manage network interfaces.

        apiVersion: v1
        kind: Pod
        metadata:
          name: network-admin-pod
        spec:
          containers:
          - name: network-tool
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
            securityContext:
              capabilities:
                add: ["NET_ADMIN"]  # Grants network administration privileges
                drop: ["ALL"]  # Drops all other capabilities
      
  5. privileged 🔓:

    • Purpose: Runs the container in privileged mode, granting full root privileges, similar to Docker’s --privileged flag.

    • Use Case: Rare cases where a container needs unrestricted access (e.g., running a system utility).

    • Example: A container running a system diagnostic tool.

        apiVersion: v1
        kind: Pod
        metadata:
          name: privileged-pod
        spec:
          containers:
          - name: diagnostic-tool
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
            securityContext:
              privileged: true  # Full root privileges
      
  6. allowPrivilegeEscalation 🚫:

    • Purpose: Controls whether a process can gain more privileges than its parent (e.g., via setuid binaries). Set to false to prevent escalation.

    • Use Case: Prevent containers from escalating privileges in sensitive environments.

    • Example: A container running untrusted code.

        apiVersion: v1
        kind: Pod
        metadata:
          name: no-escalation-pod
        spec:
          containers:
          - name: app
            image: busybox
            command: ["sh", "-c", "sleep 1h"]
            securityContext:
              allowPrivilegeEscalation: false  # Prevents privilege escalation
      
  7. readOnlyRootFilesystem 📖:

    • Purpose: Mounts the container’s root filesystem as read-only, preventing modifications.

    • Use Case: Enhance security by ensuring the container cannot alter its filesystem.

    • Example: A stateless application container.

        apiVersion: v1
        kind: Pod
        metadata:
          name: readonly-pod
        spec:
          containers:
          - name: app
            image: nginx
            securityContext:
              readOnlyRootFilesystem: true  # Root filesystem is read-only
      
  8. seccompProfile 🔐:

    • Purpose: Specifies a Seccomp profile to filter system calls, enhancing security.

    • Options:

      • RuntimeDefault: Uses the container runtime’s default profile.

      • Unconfined: No Seccomp filtering.

      • Localhost: Uses a custom profile from the node.

    • Use Case: Restrict dangerous system calls in a container.

    • Example: A container with a default Seccomp profile.

        apiVersion: v1
        kind: Pod
        metadata:
          name: seccomp-pod
        spec:
          containers:
          - name: app
            image: busybox
            securityContext:
              seccompProfile:
                type: RuntimeDefault  # Apply default Seccomp profile
      
  9. appArmorProfile 🛡️:

    • Purpose: Applies an AppArmor profile to restrict the container’s capabilities.

    • Options: RuntimeDefault, Unconfined, or Localhost with a profile name.

    • Use Case: Restrict a container’s access in an AppArmor-enabled environment.

    • Example: A container with a custom AppArmor profile.

        apiVersion: v1
        kind: Pod
        metadata:
          name: apparmor-pod
        spec:
          containers:
          - name: app
            image: busybox
            securityContext:
              appArmorProfile:
                type: Localhost
                localhostProfile: k8s-apparmor-example-deny-write
      
  10. seLinuxOptions 🔐:

    • Purpose: Overrides pod-level SELinux labels for the container.

    • Use Case: Apply a specific SELinux label to a container in an SELinux-enabled cluster.

    • Example: A container requiring a unique SELinux label.

        apiVersion: v1
        kind: Pod
        metadata:
          name: selinux-container-pod
        spec:
          containers:
          - name: app
            image: busybox
            securityContext:
              seLinuxOptions:
                level: "s0:c789,c012"
      
  11. procMount 🖥️:

    • Purpose: Overrides pod-level procMount settings for the container.

    • Use Case: A specific container needs an unmasked /proc for nested container runtimes.

    • Example: A container running a nested Kubernetes cluster.

        apiVersion: v1
        kind: Pod
        metadata:
          name: nested-k8s-pod
        spec:
          containers:
          - name: k8s
            image: kindest/node
            securityContext:
              procMount: Unmasked  # Full /proc access
      

🌐 Real-Life Example for Container-Level SecurityContext

Scenario: A pod runs a web application (Nginx) and a monitoring tool requiring specific privileges (e.g., NET_ADMIN for network diagnostics).

apiVersion: v1
kind: Pod
metadata:
  name: web-monitor-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsNonRoot: true
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
  - name: monitor
    image: busybox
    command: ["sh", "-c", "sleep 1h"]
    securityContext:
      runAsUser: 2000  # Override pod-level runAsUser
      capabilities:
        add: ["NET_ADMIN"]  # Grant network privileges
      allowPrivilegeEscalation: false  # Prevent escalation

Explanation:

  • The pod-level runAsUser: 1000 applies to the Nginx container.

  • The monitor container overrides this with runAsUser: 2000 and adds NET_ADMIN for diagnostics.

  • allowPrivilegeEscalation: false ensures the monitor cannot gain additional privileges.


🔓 4. Privileged Mode

Privileged mode (privileged: true) grants a container full root privileges, equivalent to Docker’s --privileged flag. It bypasses most security restrictions, giving the container access to the host’s resources.

When to Use Privileged Mode

  • Use Case: Rare scenarios requiring unrestricted access, such as:

    • Running system utilities (e.g., kernel debugging tools).

    • Nested container runtimes (e.g., Docker-in-Docker).

    • Hardware access (e.g., GPU drivers).

  • Risks: Highly insecure, as it allows the container to affect the host system. Avoid unless absolutely necessary.

🌐 Example of Privileged Mode

Scenario: A pod running a Docker-in-Docker (DinD) setup for a CI/CD pipeline.

apiVersion: v1
kind: Pod
metadata:
  name: dind-pod
spec:
  containers:
  - name: docker
    image: docker:dind
    securityContext:
      privileged: true  # Full root privileges
    command: ["dockerd"]

Explanation:

  • The docker:dind image requires privileged mode to run the Docker daemon, which needs access to the host’s kernel and devices.

  • This setup is common in CI/CD pipelines (e.g., Jenkins) but should be tightly controlled due to security risks.


⚖️ 5. Pod-Level vs. Container-Level SecurityContext: Differences

AspectPod-Level SecurityContext 🗂️Container-Level SecurityContext 🖥️
ScopeApplies to all containers in the pod and volumes.Applies only to the specific container.
Fields AvailableIncludes fsGroup, supplementalGroups, seLinuxOptions, fsGroupChangePolicy, supplementalGroupsPolicy, procMount.Includes capabilities, privileged, readOnlyRootFilesystem, seccompProfile, appArmorProfile, and overrides for runAsUser, runAsGroup, runAsNonRoot, seLinuxOptions, procMount.
Volume ImpactAffects volume ownership and permissions (fsGroup, seLinuxOptions).Does not affect volumes.
Override BehaviorProvides default settings for all containers.Overrides pod-level settings for the container.
Use CaseSet baseline security for all containers and volumes (e.g., shared volume permissions).Customize security for a specific container (e.g., add capabilities or run as root).

Example of Pod vs. Container-Level Interaction:

apiVersion: v1
kind: Pod
metadata:
  name: mixed-security-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: app
    image: nginx
  - name: privileged-tool
    image: busybox
    securityContext:
      runAsUser: 0  # Override to run as root
      privileged: true  # Full privileges
      capabilities:
        add: ["SYS_ADMIN"]

Explanation:

  • The app container uses the pod-level settings (runAsUser: 1000, runAsGroup: 3000).

  • The privileged-tool container overrides these with runAsUser: 0 and runs in privileged mode with additional capabilities.

  • The fsGroup: 2000 applies to any shared volumes, unaffected by container-level settings.


🤔 6. When to Use Pod-Level vs. Container-Level SecurityContext

  • Use Pod-Level SecurityContext 🗂️:

    • When all containers in the pod share common security settings (e.g., non-root execution, volume ownership).

    • For volume-related settings (fsGroup, seLinuxOptions) that apply across containers.

    • Example: A pod with multiple containers sharing a volume, requiring consistent user and group settings.

  • Use Container-Level SecurityContext 🖥️:

    • When a specific container needs different settings (e.g., one container needs NET_ADMIN or root privileges).

    • For container-specific restrictions like readOnlyRootFilesystem or seccompProfile.

    • Example: A pod where one container runs a privileged task while others are restricted.


7. Best Practices and Real-Life Considerations

  1. Minimize Privileges 🔒:

    • Avoid privileged: true unless absolutely necessary.

    • Use runAsNonRoot: true and drop unnecessary capabilities.

  2. Use Read-Only Filesystems 📖:

    • Set readOnlyRootFilesystem: true for containers that don’t need to write to their filesystem.
  3. Optimize Volume Permissions 💾:

    • Use fsGroupChangePolicy: OnRootMismatch for large volumes to reduce startup time.

    • Use supplementalGroupsPolicy: Strict to avoid unintended group memberships.

  4. Leverage Seccomp and AppArmor 🛡️:

    • Apply seccompProfile: RuntimeDefault and AppArmor profiles for additional security layers.
  5. SELinux in Secure Environments 🔐:

    • Use seLinuxOptions and seLinuxChangePolicy: Recursive in SELinux-enabled clusters for fine-grained control.
  6. Monitor and Audit 📊:

    • Use tools like kubectl describe pod and metrics (e.g., selinux_warning_controller_selinux_volume_conflict) to detect misconfigurations.

🎯 8. Conclusion

Pod-level SecurityContext 🗂️ is ideal for setting baseline security policies and managing volume permissions across all containers in a pod. Container-level SecurityContext 🖥️ allows fine-grained customization for individual containers, overriding pod-level settings when needed. Privileged mode 🔓 should be used sparingly due to its security risks.


1
Subscribe to my newsletter

Read articles from Omkar Shelke directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Omkar Shelke
Omkar Shelke