Mastering Kubernetes Security: A Deep Dive into SecurityContext

Table of contents
- 🛡️ 1. Introduction to SecurityContext in Kubernetes
- 📂 2. Pod-Level SecurityContext
- 🖥️ 3. Container-Level SecurityContext
- 🔓 4. Privileged Mode
- ⚖️ 5. Pod-Level vs. Container-Level SecurityContext: Differences
- 🤔 6. When to Use Pod-Level vs. Container-Level SecurityContext
- ✅ 7. Best Practices and Real-Life Considerations
- 🎯 8. Conclusion
🛡️ 1. Introduction to SecurityContext in Kubernetes
A SecurityContext in Kubernetes defines privilege and access control settings for pods or containers, allowing you to control how processes run, access resources, and interact with the system. It is a critical component for securing Kubernetes workloads by enforcing least-privilege principles.
Pod-Level SecurityContext 🗂️: Applies security settings to all containers in a pod and can affect the pod’s volumes. It’s defined under spec.securityContext.
Container-Level SecurityContext 🖥️: Applies to a specific container and can override pod-level settings for that container. It’s defined under spec.containers[].securityContext.
The key difference is scope:
- Pod-level settings provide a baseline for all containers and volumes in the pod.
- Container-level settings allow fine-grained customization for individual containers, overriding pod-level settings where applicable.
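As a rough sketch of where the two blocks sit in a manifest (the pod name, image, and UID values are placeholders chosen for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scope-demo              # hypothetical name
spec:
  securityContext:              # pod level: spec.securityContext
    runAsUser: 1000             # baseline for every container in the pod
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:          # container level: spec.containers[].securityContext
        runAsUser: 2000         # overrides the pod-level value for this container only
```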
📂 2. Pod-Level SecurityContext
The pod-level securityContext is defined in the pod’s spec and applies to all containers in the pod unless overridden by a container-level securityContext. It also applies to certain volume-related settings (e.g., fsGroup and seLinuxOptions).
🛠️ Fields in Pod-Level SecurityContext
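Here are the main fields available at the pod level, with their purpose and an example for each. (The pod-level securityContext also accepts seccompProfile, appArmorProfile, sysctls, and windowsOptions, which this section does not cover.)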
runAsUser 👤:
Purpose: Specifies the user ID (UID) for all containers’ processes in the pod.
Use Case: Ensures containers don’t run as root, reducing the risk of privilege escalation.
Example: A web server pod where all containers should run as a non-root user for security.
apiVersion: v1
kind: Pod
metadata:
  name: web-server-pod
spec:
  securityContext:
    runAsUser: 1000 # All containers run as UID 1000
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
In this example, the web server process runs as UID 1000 rather than root, which limits what an attacker can do if the container is compromised. Note that the stock nginx image expects to start as root (it writes to root-owned paths such as /var/cache/nginx), so in practice this setting needs an image built to run as a non-root user.
runAsGroup 👥:
Purpose: Sets the primary group ID (GID) for all containers’ processes.
Use Case: Controls group ownership for files created by containers, useful for shared volumes.
Example: A pod with a shared volume where files need consistent group ownership.
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000 # Primary group ID for processes
  volumes:
    - name: shared-data
      emptyDir: {}
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo hello > /data/testfile && sleep 1h"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
Files created in the /data volume will be owned by GID 3000, ensuring consistent group access.
runAsNonRoot 🔒:
Purpose: Ensures all containers run as a non-root user (UID ≠ 0). If set to true, the kubelet refuses to start any container that would run as root.
Use Case: Enforce a policy where no container in the pod can run as root.
Example: A corporate policy requires all pods to run non-root for compliance.
apiVersion: v1
kind: Pod
metadata:
  name: non-root-pod
spec:
  securityContext:
    runAsNonRoot: true # Enforces non-root user
  containers:
    - name: app
      image: nginx
      ports:
        - containerPort: 80
If the container tries to run as root, it will fail to start. The stock nginx image runs as root by default, so this pod is accepted by the API server but its container never starts. To pass the check, set a non-zero runAsUser or use an image that declares a non-root user.
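A minimal sketch of a pod that passes the runAsNonRoot check, assuming an image that can run as an arbitrary UID (busybox is used here as a stand-in, and the pod name and UID are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: non-root-ok-pod          # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000              # explicit non-zero UID satisfies the runAsNonRoot check
  containers:
    - name: app
      image: busybox
      # Prints the effective UID/GID to the container log, then idles
      command: ["sh", "-c", "id && sleep 1h"]
```

Pairing runAsNonRoot with an explicit runAsUser means the check does not depend on whatever user the image happens to declare.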
fsGroup 💾:
Purpose: Sets the group ID for volume ownership and permissions. Kubernetes applies this GID to volumes that support ownership management (e.g., emptyDir, persistentVolumeClaim).
Use Case: Ensures files in a shared volume are accessible by a specific group, such as in a multi-container pod.
Example: A pod with a shared volume for a data processing application.
apiVersion: v1
kind: Pod
metadata:
  name: data-processing-pod
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000 # Volume files owned by GID 2000
  volumes:
    - name: data-vol
      emptyDir: {}
  containers:
    - name: processor
      image: busybox
      command: ["sh", "-c", "echo data > /data/output && sleep 1h"]
      volumeMounts:
        - name: data-vol
          mountPath: /data
Files in /data will be owned by GID 2000, ensuring group-level access control.
supplementalGroups 👨👩👧👦:
Purpose: Adds additional group IDs to container processes, beyond the primary runAsGroup.
Use Case: Grants access to resources owned by multiple groups, such as shared storage.
Example: A pod accessing multiple shared volumes with different group ownerships.
apiVersion: v1
kind: Pod
metadata:
  name: multi-group-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    supplementalGroups: [4000, 5000] # Additional group memberships
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
Processes in the container belong to GIDs 3000, 4000, and 5000, allowing access to resources owned by these groups.
supplementalGroupsPolicy (Kubernetes v1.33+, beta) 🛠️:
Purpose: Controls how supplementary groups are calculated. Options are:
- Merge: Merges groups from the container image’s /etc/group with fsGroup and supplementalGroups.
- Strict: Only uses groups specified in fsGroup, supplementalGroups, or runAsGroup, ignoring /etc/group.
Use Case: Avoid unintended group memberships from the container image for stricter security.
Example: A pod requiring strict group control for compliance.
apiVersion: v1
kind: Pod
metadata:
  name: strict-groups-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    supplementalGroups: [4000]
    supplementalGroupsPolicy: Strict # Only specified groups are used
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
The container process will only have GIDs 3000 and 4000, ignoring any groups defined in the image’s /etc/group.
fsGroupChangePolicy ⚙️:
Purpose: Controls how Kubernetes changes ownership and permissions for volumes. Options are:
- OnRootMismatch: Only changes permissions if the volume’s root directory doesn’t match the expected fsGroup.
- Always: Always changes permissions when the volume is mounted.
Use Case: Optimize pod startup time for large volumes by reducing unnecessary permission changes.
Example: A pod with a large persistent volume.
apiVersion: v1
kind: Pod
metadata:
  name: large-volume-pod
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
    fsGroupChangePolicy: OnRootMismatch # Optimize permission changes
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"] # Keeps the container running
      volumeMounts:
        - name: data
          mountPath: /data
This reduces startup time by only changing permissions when necessary.
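The pod above refers to a claim named data-pvc that must already exist. A minimal sketch of such a claim follows, assuming the cluster has a default StorageClass; the access mode and the 10Gi size are illustrative guesses, not values from the original example:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc                 # matches claimName in the pod above
spec:
  accessModes:
    - ReadWriteOnce              # assumed access mode
  resources:
    requests:
      storage: 10Gi              # assumed size
```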
seLinuxOptions 🔐:
Purpose: Assigns SELinux labels to containers and volumes for access control.
Use Case: Enforce mandatory access control in environments with SELinux enabled (e.g., Red Hat systems).
Example: A pod running in an SELinux-enabled cluster.
apiVersion: v1
kind: Pod
metadata:
  name: selinux-pod
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456" # SELinux label for processes and volumes
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
All containers and volumes use the specified SELinux label, ensuring compliance with SELinux policies.
seLinuxChangePolicy (Kubernetes v1.33+, beta) 🔄:
Purpose: Controls SELinux relabeling behavior. Options are:
- MountOption: Uses mount options for faster relabeling (requires the SELinuxMount feature gate).
- Recursive: Recursively relabels all files in the volume.
Use Case: Optimize SELinux relabeling for performance or allow multiple pods with different labels to share a volume.
Example: A pod opting out of mount-based relabeling for compatibility.
apiVersion: v1
kind: Pod
metadata:
  name: selinux-recursive-pod
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456"
    seLinuxChangePolicy: Recursive # Recursive relabeling
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
This ensures recursive relabeling, allowing multiple pods with different SELinux labels to share a volume.
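procMount (Kubernetes v1.33+, beta) 🖥️:
Purpose: Controls how the /proc filesystem is mounted in a container. Unlike the other fields in this section, procMount is set in the container-level securityContext; there is no pod-level field of this name. It is paired with hostUsers, which is a field of the pod spec itself rather than of securityContext. Options are:
- Default: Masks certain /proc paths (e.g., /proc/kcore) and makes others read-only.
- Unmasked: Exposes all /proc paths, useful for nested container runtimes.
Use Case: Running containers within containers (e.g., Docker-in-Docker).
Example: A pod running a CI/CD pipeline with nested containers.
apiVersion: v1
kind: Pod
metadata:
  name: dind-pod
spec:
  hostUsers: false # Required for Unmasked; set on the pod spec, not in securityContext
  containers:
    - name: docker
      image: docker:dind
      command: ["dockerd"]
      securityContext:
        procMount: Unmasked # Expose full /proc
This gives the Docker daemon an unmasked view of the /proc filesystem for container management. As section 4 notes, the docker:dind image also requires privileged mode, so procMount alone does not make this pod functional.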
🌐 Real-Life Example for Pod-Level SecurityContext
Scenario: A company runs a microservices application with multiple pods, each containing multiple containers (e.g., an app and a logging sidecar). To comply with security policies, all containers must run as non-root, and shared volumes must be accessible by a specific group.
apiVersion: v1
kind: Pod
metadata:
  name: microservice-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    runAsNonRoot: true
  volumes:
    - name: logs
      emptyDir: {}
  containers:
    - name: app
      image: my-app:1.0
      volumeMounts:
        - name: logs
          mountPath: /logs
    - name: log-collector
      image: fluentd
      volumeMounts:
        - name: logs
          mountPath: /logs
Explanation:
- All containers run as UID 1000 and GID 3000.
- The logs volume is owned by GID 2000 (fsGroup), ensuring both containers can write to it.
- runAsNonRoot: true enforces non-root execution, aligning with compliance requirements.
🖥️ 3. Container-Level SecurityContext
The container-level securityContext is defined under spec.containers[].securityContext and applies only to the specific container. It can override pod-level settings for that container but doesn’t affect volumes.
🛠️ Fields in Container-Level SecurityContext
Here’s a comprehensive list of fields available at the container level:
runAsUser 👤:
Purpose: Overrides the pod-level runAsUser for the specific container.
Use Case: A specific container needs to run as a different user (e.g., root for administrative tasks).
Example: A pod with a sidecar requiring root privileges.
apiVersion: v1
kind: Pod
metadata:
  name: mixed-user-pod
spec:
  securityContext:
    runAsUser: 1000
  containers:
    - name: app
      image: nginx
      ports:
        - containerPort: 80
    - name: admin-tool
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        runAsUser: 0 # Runs as root, overriding pod-level setting
runAsGroup 👥:
Purpose: Overrides the pod-level runAsGroup for the container’s primary group ID.
Use Case: A container needs a different primary group for specific access requirements.
Example: A container accessing a volume with a unique group.
apiVersion: v1
kind: Pod
metadata:
  name: custom-group-pod
spec:
  securityContext:
    runAsGroup: 3000
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        runAsGroup: 4000 # Overrides pod-level runAsGroup
runAsNonRoot 🔒:
Purpose: Enforces non-root execution for the specific container, overriding pod-level settings.
Use Case: Ensure a specific container adheres to non-root policies, even if the pod allows root.
Example: A sidecar container must run non-root for security.
apiVersion: v1
kind: Pod
metadata:
  name: non-root-sidecar-pod
spec:
  containers:
    - name: app
      image: nginx
    - name: sidecar
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000 # busybox defaults to root, so a non-zero UID is needed to pass the check
capabilities ⚙️:
Purpose: Adds or drops Linux capabilities for the container.
Use Case: Grant specific privileges (e.g., NET_ADMIN) without full root access.
Example: A container needs to manage network interfaces.
apiVersion: v1
kind: Pod
metadata:
  name: network-admin-pod
spec:
  containers:
    - name: network-tool
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        capabilities:
          add: ["NET_ADMIN"] # Grants network administration privileges
          drop: ["ALL"] # Drops all other capabilities
privileged 🔓:
Purpose: Runs the container in privileged mode, granting full root privileges, similar to Docker’s --privileged flag.
Use Case: Rare cases where a container needs unrestricted access (e.g., running a system utility).
Example: A container running a system diagnostic tool.
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
    - name: diagnostic-tool
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        privileged: true # Full root privileges
allowPrivilegeEscalation 🚫:
Purpose: Controls whether a process can gain more privileges than its parent (e.g., via setuid binaries). Set to false to prevent escalation.
Use Case: Prevent containers from escalating privileges in sensitive environments.
Example: A container running untrusted code.
apiVersion: v1
kind: Pod
metadata:
  name: no-escalation-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        allowPrivilegeEscalation: false # Prevents privilege escalation
readOnlyRootFilesystem 📖:
Purpose: Mounts the container’s root filesystem as read-only, preventing modifications.
Use Case: Enhance security by ensuring the container cannot alter its filesystem.
Example: A stateless application container.
apiVersion: v1
kind: Pod
metadata:
  name: readonly-pod
spec:
  containers:
    - name: app
      image: nginx
      securityContext:
        readOnlyRootFilesystem: true # Root filesystem is read-only
Applications that write temporary or runtime files (the stock nginx image does, for example) need a writable volume mounted at those paths, or they will fail at startup.
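A read-only root filesystem is usually combined with a small writable volume for scratch data. The sketch below assumes the application only needs to write under /tmp; the pod name, volume name, and command are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readonly-with-scratch-pod   # hypothetical name
spec:
  volumes:
    - name: scratch                 # hypothetical writable scratch space
      emptyDir: {}
  containers:
    - name: app
      image: busybox
      # Writing to /tmp works because it is a mounted volume; writes elsewhere fail
      command: ["sh", "-c", "echo ok > /tmp/scratch-file && sleep 1h"]
      securityContext:
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: scratch
          mountPath: /tmp
```

Which paths need to be writable depends entirely on the image, so treat the /tmp mount as a starting point rather than a complete list.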
seccompProfile 🔐:
Purpose: Specifies a Seccomp profile to filter system calls, enhancing security.
Options:
- RuntimeDefault: Uses the container runtime’s default profile.
- Unconfined: No Seccomp filtering.
- Localhost: Uses a custom profile from the node.
Use Case: Restrict dangerous system calls in a container.
Example: A container with a default Seccomp profile.
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"] # Keeps the container running
      securityContext:
        seccompProfile:
          type: RuntimeDefault # Apply default Seccomp profile
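For the Localhost option, the profile is referenced through localhostProfile. In this sketch the file profiles/audit.json is a hypothetical profile that is assumed to already exist on every node where the pod can be scheduled:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-localhost-pod     # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        seccompProfile:
          type: Localhost
          # Hypothetical file; the path is relative to the kubelet's seccomp directory on the node
          localhostProfile: profiles/audit.json
```

If the file is missing on the node, the container will not start, so custom profiles are typically distributed to nodes before pods reference them.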
appArmorProfile 🛡️:
Purpose: Applies an AppArmor profile to restrict what the container’s processes can access.
Options: RuntimeDefault, Unconfined, or Localhost with a profile name.
Use Case: Restrict a container’s access in an AppArmor-enabled environment.
Example: A container with a custom AppArmor profile.
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"] # Keeps the container running
      securityContext:
        appArmorProfile:
          type: Localhost
          localhostProfile: k8s-apparmor-example-deny-write
seLinuxOptions 🔐:
Purpose: Overrides pod-level SELinux labels for the container.
Use Case: Apply a specific SELinux label to a container in an SELinux-enabled cluster.
Example: A container requiring a unique SELinux label.
apiVersion: v1
kind: Pod
metadata:
  name: selinux-container-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"] # Keeps the container running
      securityContext:
        seLinuxOptions:
          level: "s0:c789,c012"
procMount 🖥️:
Purpose: Controls how /proc is mounted for the container. procMount exists only at the container level, so there is no pod-level value to override.
Use Case: A specific container needs an unmasked /proc for nested container runtimes.
Example: A container running a nested Kubernetes cluster.
apiVersion: v1
kind: Pod
metadata:
  name: nested-k8s-pod
spec:
  hostUsers: false # Required when any container uses procMount: Unmasked
  containers:
    - name: k8s
      image: kindest/node
      securityContext:
        procMount: Unmasked # Full /proc access
🌐 Real-Life Example for Container-Level SecurityContext
Scenario: A pod runs a web application (Nginx) and a monitoring tool requiring specific privileges (e.g., NET_ADMIN for network diagnostics).
apiVersion: v1
kind: Pod
metadata:
  name: web-monitor-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsNonRoot: true
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
    - name: monitor
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        runAsUser: 2000 # Override pod-level runAsUser
        capabilities:
          add: ["NET_ADMIN"] # Grant network privileges
        allowPrivilegeEscalation: false # Prevent escalation
Explanation:
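- The pod-level runAsUser: 1000 applies to the Nginx container.
- The monitor container overrides this with runAsUser: 2000 and adds NET_ADMIN for diagnostics. Keep in mind that Linux normally clears added capabilities from the effective set when a process starts as a non-root UID, so a tool running as UID 2000 may still be unable to use NET_ADMIN.
- allowPrivilegeEscalation: false ensures the monitor cannot gain additional privileges.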
🔓 4. Privileged Mode
Privileged mode (privileged: true) grants a container full root privileges, equivalent to Docker’s --privileged flag. It bypasses most security restrictions, giving the container access to the host’s resources.
❓ When to Use Privileged Mode
Use Case: Rare scenarios requiring unrestricted access, such as:
Running system utilities (e.g., kernel debugging tools).
Nested container runtimes (e.g., Docker-in-Docker).
Hardware access (e.g., GPU drivers).
Risks: Highly insecure, as it allows the container to affect the host system. Avoid unless absolutely necessary.
🌐 Example of Privileged Mode
Scenario: A pod running a Docker-in-Docker (DinD) setup for a CI/CD pipeline.
apiVersion: v1
kind: Pod
metadata:
  name: dind-pod
spec:
  containers:
    - name: docker
      image: docker:dind
      securityContext:
        privileged: true # Full root privileges
      command: ["dockerd"]
Explanation:
- The docker:dind image requires privileged mode to run the Docker daemon, which needs access to the host’s kernel and devices.
- This setup is common in CI/CD pipelines (e.g., Jenkins) but should be tightly controlled due to security risks.
⚖️ 5. Pod-Level vs. Container-Level SecurityContext: Differences
Aspect | Pod-Level SecurityContext 🗂️ | Container-Level SecurityContext 🖥️ |
Scope | Applies to all containers in the pod and volumes. | Applies only to the specific container. |
Fields Available | Includes fsGroup, supplementalGroups, seLinuxOptions, fsGroupChangePolicy, supplementalGroupsPolicy. | Includes capabilities, privileged, readOnlyRootFilesystem, seccompProfile, appArmorProfile, procMount, and overrides for runAsUser, runAsGroup, runAsNonRoot, seLinuxOptions. |
Volume Impact | Affects volume ownership and permissions (fsGroup, seLinuxOptions). | Does not affect volumes. |
Override Behavior | Provides default settings for all containers. | Overrides pod-level settings for the container. |
Use Case | Set baseline security for all containers and volumes (e.g., shared volume permissions). | Customize security for a specific container (e.g., add capabilities or run as root). |
Example of Pod vs. Container-Level Interaction:
apiVersion: v1
kind: Pod
metadata:
  name: mixed-security-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
    - name: app
      image: nginx
    - name: privileged-tool
      image: busybox
      securityContext:
        runAsUser: 0 # Override to run as root
        privileged: true # Full privileges
        capabilities:
          add: ["SYS_ADMIN"]
Explanation:
- The app container uses the pod-level settings (runAsUser: 1000, runAsGroup: 3000).
- The privileged-tool container overrides these with runAsUser: 0 and runs in privileged mode with additional capabilities.
- The fsGroup: 2000 applies to any shared volumes, unaffected by container-level settings.
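The example above declares fsGroup but mounts no volume, so the volume behavior is not visible in it. A hedged variant with a shared emptyDir (the pod, volume, and container names are hypothetical, and busybox stands in for both images) could look like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mixed-security-volume-pod   # hypothetical name
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000                   # group ownership applied to the volume below
  volumes:
    - name: shared                  # hypothetical volume
      emptyDir: {}
  containers:
    - name: app                     # inherits UID 1000 / GID 3000 from the pod level
      image: busybox
      command: ["sh", "-c", "echo hello > /shared/from-app && sleep 1h"]
      volumeMounts:
        - name: shared
          mountPath: /shared
    - name: root-tool               # overrides the UID but mounts the same volume
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        runAsUser: 0
      volumeMounts:
        - name: shared
          mountPath: /shared
```

The container-level runAsUser changes who the process runs as, while the volume's group ownership still comes from the pod-level fsGroup.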
🤔 6. When to Use Pod-Level vs. Container-Level SecurityContext
Use Pod-Level SecurityContext 🗂️:
- When all containers in the pod share common security settings (e.g., non-root execution, volume ownership).
- For volume-related settings (fsGroup, seLinuxOptions) that apply across containers.
- Example: A pod with multiple containers sharing a volume, requiring consistent user and group settings.
Use Container-Level SecurityContext 🖥️:
- When a specific container needs different settings (e.g., one container needs NET_ADMIN or root privileges).
- For container-specific restrictions like readOnlyRootFilesystem or seccompProfile.
- Example: A pod where one container runs a privileged task while others are restricted.
✅ 7. Best Practices and Real-Life Considerations
Minimize Privileges 🔒:
- Avoid privileged: true unless absolutely necessary.
- Use runAsNonRoot: true and drop unnecessary capabilities.
Use Read-Only Filesystems 📖:
- Set readOnlyRootFilesystem: true for containers that don’t need to write to their filesystem.
Optimize Volume Permissions 💾:
- Use fsGroupChangePolicy: OnRootMismatch for large volumes to reduce startup time.
- Use supplementalGroupsPolicy: Strict to avoid unintended group memberships.
Leverage Seccomp and AppArmor 🛡️:
- Apply seccompProfile: RuntimeDefault and AppArmor profiles for additional security layers.
SELinux in Secure Environments 🔐:
- Use seLinuxOptions and seLinuxChangePolicy: Recursive in SELinux-enabled clusters for fine-grained control.
Monitor and Audit 📊:
- Use tools like kubectl describe pod and metrics (e.g., selinux_warning_controller_selinux_volume_conflict) to detect misconfigurations.
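Several of these practices can be combined in one manifest. The following is a sketch of a restrictive baseline rather than a drop-in policy; the pod name, UID/GID values, and the busybox stand-in image are assumptions, and a real workload may need a writable volume or a specific capability added back:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod              # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000               # assumed non-root UID
    runAsGroup: 3000              # assumed primary GID
    seccompProfile:
      type: RuntimeDefault        # seccompProfile can also be set at the pod level
  containers:
    - name: app
      image: busybox              # stand-in image
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]           # start from no capabilities
```

Starting from a locked-down baseline and loosening individual settings per container tends to be easier to audit than the reverse.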
🎯 8. Conclusion
Pod-level SecurityContext 🗂️ is ideal for setting baseline security policies and managing volume permissions across all containers in a pod. Container-level SecurityContext 🖥️ allows fine-grained customization for individual containers, overriding pod-level settings when needed. Privileged mode 🔓 should be used sparingly due to its security risks.
Written by Omkar Shelke
