Kubernetes Storage and Security

Aditya Murali

Why do we need storage in Kubernetes?

We need storage in Kubernetes to provide a way for applications to store and access persistent data. Persistent storage is important for many types of applications, including databases, file systems, and other stateful applications.

Kubernetes storage enables applications to store and retrieve data even if the underlying containers or nodes fail. Kubernetes manages the storage resources so that volumes outlive individual containers, and, depending on the storage backend, data can be replicated and remain available even in the event of a failure.

In addition, Kubernetes storage allows applications to be designed to be more scalable and flexible. For example, applications can be designed to scale horizontally by adding more instances of the application, each with its own storage volume. This allows the application to handle more load and provide more capacity for storing data.

Kubernetes storage also provides a way for administrators to manage storage resources more efficiently. By using storage classes and dynamic provisioning, administrators can automatically allocate storage resources as needed, without having to manually provision and manage each storage volume.

Types of Storage and Their Importance

There are several types of Kubernetes storage available, including:

  1. Persistent Volume (PV) and Persistent Volume Claim (PVC)

  2. StorageClass

Persistent Volumes and Persistent Volume Claims

A Persistent Volume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using a storage class. PVs are abstract representations of physical storage devices, such as disks or SSDs, and can be used by applications to store data persistently.

Here is an example to illustrate how PVs work in Kubernetes:

Let's say you have a Kubernetes cluster running multiple nodes, and you want to deploy a stateful application that requires persistent storage. You could create a Persistent Volume by first defining a YAML file like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/my-pv

This YAML file specifies the following:

  • apiVersion and kind: These fields specify that we are creating a PersistentVolume resource.

  • metadata: This field specifies the name of the Persistent Volume.

  • spec: This field specifies the details of the Persistent Volume, such as its capacity and access modes.

In this example, we are creating a Persistent Volume named "my-pv" with a capacity of 10 gigabytes, and with access mode "ReadWriteOnce". This means that the volume can be mounted by a single node at a time and can be both read from and written to.

We are using the hostPath field to specify that the Persistent Volume will be backed by a directory on the node's file system. In this case, the directory is /data/my-pv.

Once you have defined your Persistent Volume, you can create a Persistent Volume Claim (PVC) to request storage from it. The PVC can be defined like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

This YAML file specifies the following:

  • apiVersion and kind: These fields specify that we are creating a PersistentVolumeClaim resource.

  • metadata: This field specifies the name of the Persistent Volume Claim.

  • spec: This field specifies the details of the Persistent Volume Claim, such as its access modes and the amount of storage requested.

In this example, we are creating a Persistent Volume Claim named "my-pvc" with access mode "ReadWriteOnce" and requesting 5 gigabytes of storage.

When you create the PVC, Kubernetes will look for a PV that matches the access modes and capacity requested by the PVC. If a suitable PV is found, it will be bound to the PVC, and the PVC can then be used by your application to store data.

Here is an overview of how the binding of PVs and PVCs works in Kubernetes:

  1. The administrator creates a Persistent Volume (PV) with a specific capacity, access modes, and storage type.

  2. The user creates a Persistent Volume Claim (PVC) that specifies the required capacity, access modes, and optionally, a storage class.

  3. Kubernetes matches the PVC to a suitable PV by checking if the PV's capacity, access modes, and storage class (if specified) match the PVC's requirements.

  4. If a matching PV is found, Kubernetes binds the PV to the PVC. The PVC then becomes "bound" to the PV, and the user can use the PVC to store data.

  5. If no matching PV is found, Kubernetes will either dynamically provision a new PV that matches the PVC's requirements (if a storage class is specified), or the PVC remains unbound and the user cannot use it to store data.

Once a PV is bound to a PVC, the user can use the PVC to store data in a container running in a pod. The user specifies the PVC name as the storage volume in the pod's YAML file, and Kubernetes will automatically mount the PVC to the container.
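For example, a Pod manifest that mounts the "my-pvc" claim from earlier could look roughly like this (the pod and container names here are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod
spec:
  containers:
  - name: my-app
    image: nginx
    volumeMounts:
    - name: my-storage
      mountPath: /data
  volumes:
  - name: my-storage
    persistentVolumeClaim:
      claimName: my-pvc

Kubernetes mounts the volume bound to "my-pvc" at /data inside the container, so anything the application writes there survives container restarts.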

So, in summary, a Persistent Volume in Kubernetes is a way to provide persistent storage for applications running in a cluster. It allows applications to store data even if the underlying containers or nodes fail.

StorageClass

A storage class is a way to define the type and configuration of the storage that will be used to store persistent data in Kubernetes. It provides an abstraction layer between the application and the storage system, making it easier to manage and scale storage resources.

Here is an example of a StorageClass definition in Kubernetes:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: my-storage-class
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-west-2a

This example defines a storage class called "my-storage-class" that uses the AWS Elastic Block Store (EBS) provisioner to provision persistent volumes. The storage class is configured to use the gp2 volume type and the us-west-2a availability zone.

When an application needs storage, its PersistentVolumeClaim can specify the storage class to use (via the storageClassName field), and Kubernetes will dynamically provision a persistent volume that meets the requirements of that storage class. This allows for easy management and scaling of storage resources in Kubernetes.
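For instance, a PVC that requests storage from the "my-storage-class" class defined above might look like this (the claim name is illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: my-storage-class
  resources:
    requests:
      storage: 10Gi

When this PVC is created, Kubernetes asks the storage class's provisioner to create a matching volume on demand, rather than waiting for an administrator to pre-provision one.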

Container Storage Interface

The Container Storage Interface (CSI) in Kubernetes is a standard interface for connecting container orchestrators, like Kubernetes, with external storage systems. It allows Kubernetes to provision and manage storage resources from different storage vendors and technologies in a standardized way.

Before CSI, Kubernetes provided two storage plugins: in-tree volume plugins and FlexVolume plugins. In-tree volume plugins were integrated directly into the Kubernetes codebase, which made it difficult to add new plugins or update existing ones. FlexVolume plugins were more flexible, but still had limitations and were complex to maintain. CSI addresses these issues by providing a standardized interface that decouples the storage vendor's driver from the Kubernetes core.

With CSI, storage vendors can create their own CSI driver, which can be used by Kubernetes to provision and manage storage resources from the vendor's storage system. This allows for more flexibility and scalability in managing storage in Kubernetes.

The Container Storage Interface (CSI) works by providing a standard interface for communication between Kubernetes and external storage systems. CSI drivers are responsible for implementing this interface and enabling Kubernetes to manage storage resources from different vendors and technologies.

Here is a high-level overview of how CSI works in Kubernetes:

  1. A user creates a PersistentVolumeClaim (PVC) object that requests storage resources with specific parameters, such as storage class, access mode, and capacity.

  2. Kubernetes, via the CSI driver's external-provisioner sidecar, provisions a PersistentVolume (PV) that satisfies the PVC requirements. The CSI driver communicates with the external storage system to create a new volume and the corresponding PV object.

  3. Once the PV is provisioned, the user creates a Pod that requires persistent storage, and includes the PVC in the Pod specification.

  4. The kubelet on the node where the Pod is scheduled uses the CSI driver to attach and mount the PV for the Pod. The CSI driver ensures that the Pod has access to the correct volume and can read and write data to it.

  5. If the Pod is deleted, the kubelet uses the CSI driver to unmount the volume, and the volume can then be detached from the node.

  6. If the PV is no longer needed, the Kubernetes administrator can delete the PV and the CSI driver will communicate with the storage system to release the resources.
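As a sketch of how this looks in practice, a StorageClass backed by a CSI driver is defined just like the earlier example, except the provisioner field names the CSI driver. The example below assumes the AWS EBS CSI driver (ebs.csi.aws.com) is installed in the cluster:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-storage-class
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer

With WaitForFirstConsumer, the volume is only provisioned once a Pod that uses the PVC is scheduled, so it can be created in the same availability zone as the node.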

StatefulSets in Kubernetes

StatefulSets are a type of controller that manages a set of stateful pods, guaranteeing a stable, unique identity for each pod in the set and supporting ordered, graceful deployment and scaling of those pods.

Let's say you have a stateful application that requires persistent storage, and you want to deploy it in a Kubernetes cluster using StatefulSets. Here are the steps you can follow:

  1. Create a PersistentVolumeClaim (PVC) that defines the storage requirements for your application. (Strictly speaking, the StatefulSet in step 2 creates one such PVC per replica automatically through its volumeClaimTemplates, so this standalone claim mainly illustrates what each claim looks like.) For example:

     apiVersion: v1
     kind: PersistentVolumeClaim
     metadata:
       name: my-app-pvc
     spec:
       accessModes:
       - ReadWriteOnce
       resources:
         requests:
           storage: 1Gi
    
  2. Create a StatefulSet that defines the deployment and scaling of your application. For example:

     apiVersion: apps/v1
     kind: StatefulSet
     metadata:
       name: my-app
     spec:
       replicas: 3
       selector:
         matchLabels:
           app: my-app
       serviceName: my-app
       template:
         metadata:
           labels:
             app: my-app
         spec:
           containers:
           - name: my-app
             image: my-app-image:latest
             ports:
             - containerPort: 8080
             volumeMounts:
             - name: my-app-storage
               mountPath: /data
       volumeClaimTemplates:
       - metadata:
           name: my-app-storage
         spec:
           accessModes:
           - ReadWriteOnce
           resources:
             requests:
               storage: 1Gi
    

    This StatefulSet will deploy three replicas of your application, each with a unique hostname based on the StatefulSet name and the replica index (e.g. my-app-0, my-app-1, my-app-2). The StatefulSet also defines a volumeClaimTemplate that will dynamically create a PVC for each replica. (The serviceName field refers to a headless Service named "my-app"; a sketch of that Service follows these steps.)

  3. Apply the PVC and StatefulSet manifests to your Kubernetes cluster:

     kubectl apply -f my-app-pvc.yaml
     kubectl apply -f my-app-statefulset.yaml
    
  4. Check the status of the StatefulSet:

     kubectl get statefulset my-app
    

    This should show you the current status of the StatefulSet, including the number of replicas running and their current state.

With this example, you now have a stateful application running in a Kubernetes cluster, with persistent storage provided by a StatefulSet and PVCs. Each replica has its own persistent storage, and you can scale the application up or down by adjusting the replicas value in the StatefulSet specification.
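One detail the steps above take for granted: the StatefulSet's serviceName field refers to a headless Service named "my-app", which gives each replica its stable DNS name. A minimal sketch of such a Service, assuming the same labels and port as the StatefulSet above, would be:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  clusterIP: None
  selector:
    app: my-app
  ports:
  - port: 8080

Apply it alongside the other manifests so that names like my-app-0.my-app resolve inside the cluster.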

What is Security in Kubernetes?

Kubernetes security refers to the various mechanisms and best practices used to protect the Kubernetes infrastructure, the applications running on it, and the sensitive data stored within the system.

There are several reasons why security is essential in Kubernetes:

  1. Protection from attacks: Kubernetes is vulnerable to various types of attacks such as unauthorized access, privilege escalation, and denial of service (DoS) attacks. Proper security measures help prevent these attacks from being successful.

  2. Compliance: Many organizations are subject to regulatory compliance requirements, such as HIPAA, GDPR, and PCI DSS, which require specific security measures to be in place. Kubernetes security helps organizations meet these requirements.

  3. Data protection: Kubernetes can store sensitive data such as user credentials, API keys, and encryption keys. Proper security measures help protect this data from unauthorized access and misuse.

  4. Reputation: Security breaches can damage an organization's reputation, leading to loss of customers and revenue. By implementing robust security measures in Kubernetes, organizations can protect their reputation and maintain the trust of their customers.

Types of Security

Here are some examples of Kubernetes security measures and best practices that you can implement:

  1. RBAC (Role-Based Access Control): RBAC allows you to control access to Kubernetes resources by defining roles and permissions for different users or groups. For example, you can create a role that allows a developer to view and modify resources in a specific namespace, but not delete them.

  2. Network Policies: Network policies allow you to control network traffic between Kubernetes pods and external networks. You can use network policies to limit access to sensitive resources and prevent unauthorized access. For example, you can create a network policy that only allows traffic from specific IP addresses or ports.

  3. Secrets Management: Kubernetes provides a built-in secrets management system that allows you to securely store and manage sensitive information such as passwords, API keys, and certificates. You can use secrets to authenticate with external services, encrypt sensitive data, and control access to Kubernetes resources.

  4. Container Image Scanning: You can use container image scanning tools like Trivy, Aqua Security, or Anchore to scan your container images for vulnerabilities, malware, or other security issues before deploying them to your Kubernetes cluster.

  5. Pod Security Policies: Pod Security Policies (PSPs) are used to enforce security policies on the pods that run on a Kubernetes cluster. PSPs allow you to control the security settings of your pods, such as allowing or denying privileged containers, setting file permissions, and limiting the use of host resources. (PSPs were deprecated in Kubernetes 1.21 and removed in 1.25 in favor of Pod Security Admission, but the underlying ideas carry over.)

  6. TLS Encryption: Kubernetes allows you to enable TLS encryption for communication between Kubernetes components and between your applications and the Kubernetes API server. You can use TLS certificates to secure your network traffic and prevent unauthorized access.

  7. Container Runtime Security: You can use container runtime security tools like SELinux, AppArmor, or seccomp to restrict the actions that a container can perform. These tools provide additional layers of security by limiting the privileges of the containers running on your Kubernetes cluster.
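To illustrate the last point, a pod can opt into the container runtime's default seccomp profile and drop Linux capabilities through its securityContext. A minimal sketch (the pod name and sleep command are just for illustration):

apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]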

How to implement security

RBAC (Role-Based Access Control)

  1. Create a new namespace called "test" using the following command:

     kubectl create namespace test
    
  2. Create a new role named "test-role" that allows a user to view, create, and delete pods in the "test" namespace:

     apiVersion: rbac.authorization.k8s.io/v1
     kind: Role
     metadata:
       name: test-role
       namespace: test
     rules:
     - apiGroups: [""]
       resources: ["pods"]
       verbs: ["get", "create", "delete"]
    
  3. Make sure a user identity named "test-user" exists. Kubernetes has no User API object and no "kubectl create user" command; users are defined outside the cluster, for example as a client certificate whose Common Name (CN) is "test-user", or as an identity from an external provider such as OIDC. For this example, assume the cluster already recognizes a user named "test-user".

  4. Create a new role binding named "test-rolebinding" that binds the "test-role" to the "test-user":

     apiVersion: rbac.authorization.k8s.io/v1
     kind: RoleBinding
     metadata:
       name: test-rolebinding
       namespace: test
     subjects:
     - kind: User
       name: test-user
       apiGroup: rbac.authorization.k8s.io
     roleRef:
       kind: Role
       name: test-role
       apiGroup: rbac.authorization.k8s.io
    

Now, the "test-user" has permissions to view, create, and delete pods in the "test" namespace. You can test this by switching to the "test-user" context and running the following command:

kubectl get pods -n test

This should show you the list of pods in the "test" namespace. If you try to run the same command in a different namespace or with a different user, you will receive a "forbidden" error message.

This is just a simple example of how to use RBAC in Kubernetes. RBAC can be used to define more complex roles and permissions for different users and groups, and can help you control access to your Kubernetes resources.
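As a rough sketch of how you might apply and verify the objects above (assuming the Role and RoleBinding YAML are saved as test-role.yaml and test-rolebinding.yaml, and that your own account is allowed to impersonate users):

kubectl apply -f test-role.yaml
kubectl apply -f test-rolebinding.yaml
kubectl auth can-i get pods --namespace test --as test-user
kubectl auth can-i delete pods --namespace default --as test-user

The first auth check should answer "yes" and the second "no", confirming that the permissions are scoped to pods in the "test" namespace.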

Network Policies

  1. Create a new namespace called "test" using the following command:

     kubectl create namespace test
    
  2. Create a network policy that allows traffic only from pods labeled "app=test" to other pods in the "test" namespace:

     apiVersion: networking.k8s.io/v1
     kind: NetworkPolicy
     metadata:
       name: test-networkpolicy
       namespace: test
     spec:
       podSelector:
         matchLabels:
           app: test
       policyTypes:
       - Ingress
       ingress:
       - from:
         - podSelector:
             matchLabels:
               app: test
    
  3. Deploy two sample pods in the "test" namespace, one with the label "app=test" and one without (the busybox pod is given a long-running sleep command so it stays up and can be exec'd into):

     kubectl run nginx --image=nginx --labels=app=test --namespace=test
     kubectl run busybox --image=busybox --namespace=test -- sleep 3600
    
  4. Try to reach the nginx pod from the busybox pod. Bare pod names are not resolvable through cluster DNS, so look up the nginx pod's IP address first and ping that:

     NGINX_IP=$(kubectl get pod nginx --namespace=test -o jsonpath='{.status.podIP}')
     kubectl exec -it busybox --namespace=test -- ping -c 3 $NGINX_IP
    

    This command should fail because the network policy we created in step 2 only allows traffic from pods labeled "app=test".

  5. Label the busybox pod with the "app=test" label:

     kubectl label pod busybox app=test --namespace=test
    
  6. Try to ping the nginx pod from the busybox pod again, using the same pod IP as before:

     kubectl exec -it busybox --namespace=test -- ping -c 3 $NGINX_IP
    

    This time the command should succeed because the busybox pod now has the required label and is allowed to communicate with the nginx pod according to the network policy.

This is just a simple example of how to use network policies in Kubernetes. Keep in mind that network policies are enforced by the cluster's network (CNI) plugin, such as Calico or Cilium; on a plugin that does not implement NetworkPolicy, the policy objects are accepted but have no effect. Network policies can be used to define more complex rules for controlling traffic between pods in your cluster and can help you ensure the security and reliability of your applications.
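A common starting point is a default-deny policy that blocks all ingress traffic to every pod in a namespace, on top of which you allow only the traffic you need. A minimal sketch for the "test" namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: test
spec:
  podSelector: {}
  policyTypes:
  - Ingress

The empty podSelector selects every pod in the namespace, and because the policy lists no ingress rules, all incoming traffic is denied until another policy explicitly allows it.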

Secret Management

  1. Create a new Secret object with the following command:

     kubectl create secret generic mysecret --from-literal=username=myusername --from-literal=password=mypassword
    

    This creates a new Secret object named "mysecret" with two key-value pairs: "username" and "password". The values for these keys are set to "myusername" and "mypassword", respectively.

  2. Use the Secret in a Pod by referencing it in the Pod's YAML definition file. For example:

     apiVersion: v1
     kind: Pod
     metadata:
       name: mypod
     spec:
       containers:
       - name: mycontainer
         image: myimage
         env:
         - name: MY_USERNAME
           valueFrom:
             secretKeyRef:
               name: mysecret
               key: username
         - name: MY_PASSWORD
           valueFrom:
             secretKeyRef:
               name: mysecret
               key: password
    

    This YAML file creates a new Pod named "mypod" with a container named "mycontainer". The container's image is set to "myimage", and two environment variables are defined: "MY_USERNAME" and "MY_PASSWORD". The values for these environment variables are set using the "valueFrom" field, which references the "mysecret" Secret object and the "username" and "password" keys.

    When the Pod is created, Kubernetes resolves the secretKeyRef references and injects the values of the "username" and "password" keys directly into the container as the "MY_USERNAME" and "MY_PASSWORD" environment variables.

    No file is created in this case; the Secret's keys only appear as files inside the container if you mount the Secret through a volume instead of (or in addition to) referencing it in environment variables.

Secrets can also be used to store other sensitive data like API keys, certificates, etc. The above example demonstrates how Secrets can be used to store and access sensitive data within a Kubernetes Pod.
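For completeness, here is a rough sketch of the volume-based approach, where the same Secret is exposed as files inside the container (the pod name and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: mypod-with-secret-volume
spec:
  containers:
  - name: mycontainer
    image: myimage
    volumeMounts:
    - name: secret-volume
      mountPath: /etc/secrets
      readOnly: true
  volumes:
  - name: secret-volume
    secret:
      secretName: mysecret

With this layout the container sees one file per key, for example /etc/secrets/username and /etc/secrets/password.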

Pod Security Policies

  1. Create a new Pod Security Policy (PSP) object with the following YAML definition. (Note that PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25, where Pod Security Admission replaces it, so this example applies to clusters running version 1.24 or earlier.)

     apiVersion: policy/v1beta1
     kind: PodSecurityPolicy
     metadata:
       name: example-psp
     spec:
       privileged: false
       allowPrivilegeEscalation: false
       runAsUser:
         rule: 'RunAsAny'
       seLinux:
         rule: 'RunAsAny'
       fsGroup:
         rule: 'RunAsAny'
       volumes:
       - '*'
    

    This PSP object is named "example-psp" and defines a set of security policies that Pods must adhere to when they are created by a user or service account authorized to use the PSP. In this example, the policies include disallowing privileged containers, disallowing privilege escalation, allowing any user ID to run the Pod (via the "RunAsAny" rule for the "runAsUser" field), allowing any SELinux context to be used (via the "RunAsAny" rule for the "seLinux" field), allowing any group ID to be used (via the "RunAsAny" rule for the "fsGroup" field), and allowing any volume type to be used (via the "*" wildcard in the "volumes" field).

  2. Bind the PSP to a specific service account with the following YAML definition:

     apiVersion: rbac.authorization.k8s.io/v1
     kind: RoleBinding
     metadata:
       name: example-psp-binding
       namespace: default
     roleRef:
       apiGroup: rbac.authorization.k8s.io
       kind: ClusterRole
       name: psp:example-psp
     subjects:
     - kind: ServiceAccount
       name: default
       namespace: default
    

    This RoleBinding object grants the "default" service account in the "default" namespace permission to use the "example-psp" PSP, which means that any Pod created through that service account must adhere to the policies defined in the PSP. Note that the roleRef points to a ClusterRole named "psp:example-psp" that grants the "use" verb on the PSP; that ClusterRole must exist separately, and a sketch of it follows these steps.

  3. Create a Pod that adheres to the PSP with the following YAML definition:

     apiVersion: v1
     kind: Pod
     metadata:
       name: example-pod
     spec:
       containers:
       - name: example-container
         image: nginx
       serviceAccountName: default
    

    This YAML file creates a new Pod named "example-pod" with a single container named "example-container" running the "nginx" image. The Pod specifies that it should use the "default" service account, which is bound to the "example-psp" PSP. Therefore, the Pod must adhere to the policies defined in the PSP.

    In this example, the "example-container" container does not require any special privileges, so it should adhere to the PSP's policies. If the container attempted to run with privileged mode or escalate privileges, for example, the Pod would fail to launch.
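As noted in step 2, the RoleBinding assumes a ClusterRole that grants permission to use the PSP. A rough sketch of that ClusterRole (its name simply matches what the RoleBinding's roleRef expects):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp:example-psp
rules:
- apiGroups: ["policy"]
  resources: ["podsecuritypolicies"]
  verbs: ["use"]
  resourceNames: ["example-psp"]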

TLS Encryption

  1. Generate a self-signed certificate and private key with the following command:

     openssl req -x509 -newkey rsa:4096 -keyout tls.key -out tls.crt -days 365 -nodes
    

    This generates a self-signed certificate and private key with a 4096-bit RSA key, valid for 365 days, and stores them in files named "tls.key" and "tls.crt", respectively.

  2. Create a new Kubernetes Secret object to store the certificate and private key with the following command:

     kubectl create secret tls tls-secret --key tls.key --cert tls.crt
    

    This creates a new Secret object named "tls-secret" and stores the "tls.key" and "tls.crt" files as the "tls.key" and "tls.crt" keys in the Secret object, respectively.

  3. Modify the Kubernetes YAML file for your Deployment or Service to use the TLS certificate and private key. For example:

     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: example-deployment
     spec:
       replicas: 1
       selector:
         matchLabels:
           app: example
       template:
         metadata:
           labels:
             app: example
         spec:
           containers:
           - name: example-container
             image: myimage
             ports:
             - containerPort: 80
               protocol: TCP
             - containerPort: 443
               protocol: TCP
             volumeMounts:
             - name: tls-certs
               mountPath: /etc/tls
               readOnly: true
           volumes:
           - name: tls-certs
             secret:
               secretName: tls-secret
    

    This YAML file creates a new Deployment object named "example-deployment" with a single container named "example-container" running the "myimage" image. The container exposes two ports: port 80 for HTTP traffic and port 443 for HTTPS traffic. The YAML file also specifies a volumeMount named "tls-certs" that mounts the "tls-secret" Secret object at "/etc/tls" inside the container.

    The container can then access the TLS certificate and private key by reading the "/etc/tls/tls.crt" and "/etc/tls/tls.key" files, respectively. In this example, you would need to configure your web server running inside the container to use the TLS certificate and private key.

    Note that this example uses a self-signed certificate, which is not recommended for production use. In a production environment, you would typically obtain a certificate from a trusted Certificate Authority (CA) and use that certificate instead.
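To sanity-check the setup, you can inspect the Secret and confirm that the files are mounted inside a running pod of the Deployment (assuming the manifests above have already been applied):

kubectl get secret tls-secret -o yaml
kubectl exec deploy/example-deployment -- ls /etc/tls

The second command should list tls.crt and tls.key.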

Thanks for reading!
