Kubernetes Volumes Explained: Day 29 of 40daysofkubernetes

Shivam Gautam

Introduction

In our previous blog, we delved into Docker Volumes. Now, it's time to explore Kubernetes Volumes.

In a Kubernetes environment, managing data persistence is crucial, especially for stateful applications. By default, a container's filesystem is ephemeral and tied to the lifecycle of its Pod, so when the Pod is deleted, its data is lost. This is where Kubernetes volumes come into play. Persistent volumes in particular keep your critical data available across container restarts, and even when the Pods running the application are replaced or rescheduled.

Let's dive into this with some examples.

Example Without Persistent Volume

apiVersion: v1
kind: Pod
metadata:
  name: redis-pod
spec:
  containers:
    - name: redis
      image: redis
      volumeMounts:
        - name: redis-storage
          mountPath: /data/redis
  volumes:
    - name: redis-storage
      emptyDir: {}

Explanation:

  • volumeMounts: This section defines where and how the volumes should be mounted into the container.

    • name: redis-storage: Refers to a volume named redis-storage (defined in the volumes section below).

    • mountPath: /data/redis: Specifies the directory inside the container where the volume will be mounted. In this case, the volume will be mounted at /data/redis inside the Redis container.

  • volumes: This section defines the volumes that are available to be mounted by containers in the Pod.

    • name: redis-storage: This is the name of the volume, which is referenced by the volumeMounts in the container specification.

    • emptyDir: {}: Specifies the type of volume. emptyDir is a type of volume that creates an empty directory stored on the node's filesystem where the Pod is running. The emptyDir volume starts empty when the Pod is created, and data written to this volume persists across container restarts within the same Pod. However, the data in an emptyDir is permanently deleted if the Pod is removed from the node.

This configuration sets up a Pod named redis-pod running a Redis container that stores data at /data/redis on an emptyDir volume. Because the volume exists only for the lifetime of the Pod on the node, it suits scratch space, caches, and other temporary data, but not data that must survive the Pod being deleted.
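Since the volumes section is defined at the Pod level, more than one container in the same Pod can mount the same volume. The manifest below is a minimal sketch of that idea, not part of the original walkthrough: the names shared-pod, writer, reader, and scratch, and the use of the busybox image, are all assumptions chosen for illustration.

```yaml
# Sketch: two containers in one Pod sharing a single emptyDir volume.
# All names here (shared-pod, writer, reader, scratch) are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: shared-pod
spec:
  containers:
    - name: writer
      image: busybox
      # Append a timestamp to a file on the shared volume every 5 seconds.
      command: ["sh", "-c", "while true; do date >> /scratch/log.txt; sleep 5; done"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
    - name: reader
      image: busybox
      # Make sure the file exists, then follow what the writer appends.
      command: ["sh", "-c", "touch /scratch/log.txt; tail -f /scratch/log.txt"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}
```

Both containers see the same files under /scratch, and, as with the Redis example, everything in the volume is gone once the Pod is deleted.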

Try It Out:

  1. Apply this YAML and exec into the Pod:

     kubectl apply -f redis.yaml
     kubectl exec -it redis-pod -- sh

  2. Go to the /data/redis directory and create a file named test.txt.

  3. Now, delete the Pod and recreate it. Check the /data/redis directory again, and you will find that the test.txt file is gone.

Introducing Persistent Volumes

Kubernetes Volumes are an essential component for managing storage within a cluster. Let’s break down three key concepts: Persistent Volume (PV), Persistent Volume Claim (PVC), and StorageClass.

Persistent Volume (PV)

A Persistent Volume is a piece of storage in the cluster that has been provisioned by an administrator or dynamically through a StorageClass. It is independent of any individual Pod and exists beyond the lifecycle of a Pod. PVs are cluster resources, just like nodes are cluster resources.

  • Characteristics:

    • A PV is a resource in the cluster that encapsulates the storage details like capacity, access modes, and reclaim policy.

    • PVs are usually backed by physical storage, such as cloud storage, networked storage systems, or even local disks.

    • PVs are created and managed by the administrator or automatically using a StorageClass.
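To make those three storage details concrete, here is a small sketch of where capacity, access modes, and the reclaim policy sit in a PV manifest. The name example-pv, the 2Gi size, and the /mnt/data path are assumptions for illustration only.

```yaml
# Sketch of a PersistentVolume; example-pv and /mnt/data are illustrative.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 2Gi                          # how much storage this PV offers
  accessModes:
    - ReadWriteOnce                       # read-write by a single node at a time
  persistentVolumeReclaimPolicy: Retain   # keep the data after the claim is deleted
  hostPath:
    path: /mnt/data                       # directory on the node backing this PV
```

Retain is the default for manually created PVs, so writing it out here mainly serves as documentation.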

Persistent Volume Claim (PVC)

A Persistent Volume Claim is a request for storage by a user. It is similar to a Pod requesting compute resources like CPU and memory. A PVC specifies the amount of storage, access modes, and an optional StorageClass.

  • Characteristics:

    • PVCs are requests for storage, similar to how Pods request CPU or memory.

    • When a PVC is created, Kubernetes looks for a PV that matches the claim and binds the PVC to that PV.

    • Once bound, the PVC can be used by Pods to access the storage.

Example on PV and PVC

We are using a control-plane node that we created on AWS. To configure it, you can refer to this blog.

  1. On the control plane, run kubectl get nodes and confirm that every node reports a Ready status.

  2. Create a folder named day29 in /home/ubuntu (the path the PV will point to) and, inside it, create a file index.html using:

     nano index.html
    

    Add some content, like "Hello Kubernetes."

  3. Now create pv.yaml using your favorite editor, like nano or vi, and paste the following content:

     apiVersion: v1
     kind: PersistentVolume
     metadata:
       name: task-pv-volume
       labels:
         type: local
     spec:
       storageClassName: standard
       capacity:
         storage: 1Gi
       accessModes:
         - ReadWriteOnce
       hostPath:
         path: "/home/ubuntu/day29"
    

    This PersistentVolume (task-pv-volume) provides 1 GiB of storage located on the node's local filesystem at the path /home/ubuntu/day29. The volume is labeled type: local, uses the standard storage class name, and can be accessed in ReadWriteOnce mode, meaning it can be mounted as read-write by a single node at a time. This PV can be claimed by a PersistentVolumeClaim (PVC) that requests storage with matching characteristics, allowing Pods to use the storage in a consistent and reliable manner.

  4. Create pvc.yaml with the following content:

     apiVersion: v1
     kind: PersistentVolumeClaim
     metadata:
       name: task-pv-claim
     spec:
       storageClassName: standard
       accessModes:
         - ReadWriteOnce
       resources:
         requests:
           storage: 500Mi
    

    This PersistentVolumeClaim (task-pv-claim) requests 500 MiB of storage with the ReadWriteOnce access mode, meaning it can be mounted as read-write by a single node. It also sets storageClassName to standard. A claim binds only to a PV with the same storageClassName, so this value must match the one set on task-pv-volume.

    When this PVC is created, Kubernetes looks for a PersistentVolume (PV) that meets these criteria: at least 500 MiB of storage, the ReadWriteOnce access mode, and the standard storageClassName. Once a suitable PV is found, it is bound to this PVC, allowing Pods to use the storage for their data needs. Binding is one-to-one, so although the claim asks for only 500 MiB, it takes the whole 1 GiB task-pv-volume, and no other claim can use the remainder.

  5. Finally, create pod.yaml:

     apiVersion: v1
     kind: Pod
     metadata:
       name: task-pv-pod
     spec:
       volumes:
         - name: task-pv-storage
           persistentVolumeClaim:
             claimName: task-pv-claim
       containers:
         - name: task-pv-container
           image: nginx
           ports:
             - containerPort: 80
               name: "http-server"
           volumeMounts:
             - mountPath: "/usr/share/nginx/html"
               name: task-pv-storage
    

    This Pod configuration (task-pv-pod) creates a single container running the NGINX web server. The Pod uses a PersistentVolumeClaim (task-pv-claim) to request storage, which is then mounted into the container at /usr/share/nginx/html. This setup allows NGINX to serve files from the persistent storage provided by the underlying PersistentVolume (PV) linked to the PVC.

    In essence, this configuration ensures that any content placed in the mounted directory will persist even if the Pod is deleted and recreated, as long as the PVC remains bound to the PV. This is particularly useful for web applications where you need to serve persistent content.

Apply the three manifests with kubectl apply -f pv.yaml, kubectl apply -f pvc.yaml, and kubectl apply -f pod.yaml, then exec into the Pod with kubectl exec -it task-pv-pod -- sh and navigate to the /usr/share/nginx/html directory. You should see the files from the day29 folder, including index.html, because that folder backs the PersistentVolume bound to the PVC task-pv-claim.

The data in the day29 folder persists independently of the Pod lifecycle. If the Pod is deleted and recreated, the files are still available as long as the PV is intact and still bound to the PVC. Because this PV uses hostPath, the files live on a single node's disk, so the recreated Pod sees them only if it is scheduled onto that same node.
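The task-pv-volume manifest carries a type: local label that task-pv-claim never uses. As a hedged sketch of how that label could come into play, a claim can add a selector so that it only binds to PVs carrying matching labels. The claim name task-pv-claim-selected is hypothetical; everything else mirrors the pvc.yaml from the walkthrough.

```yaml
# Sketch: a claim that only binds to PVs labeled type: local.
# The name task-pv-claim-selected is hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim-selected
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  selector:
    matchLabels:
      type: local    # must match the labels set on the PV
```

Note that a PV can be bound to only one claim at a time, so this claim would stay Pending while task-pv-claim still holds task-pv-volume.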

StorageClass

A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, backup policies, or other characteristics.

  • Characteristics:

    • StorageClasses define the provisioner (the plugin that provides storage), its parameters, and the reclaim policy, which determines what happens to a dynamically provisioned PV when its PVC is deleted.

    • When a PVC requests storage with a specific StorageClass, the provisioner named in that StorageClass dynamically creates a PV that matches the requirements of the PVC.
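As an illustration of those fields, here is a minimal StorageClass sketch. It assumes the AWS EBS CSI driver is installed in the cluster (the setup in this post does not state that), and the name fast-ssd is made up for the example.

```yaml
# Sketch of a StorageClass; assumes the AWS EBS CSI driver is installed.
# The name fast-ssd is illustrative.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com              # plugin that creates the volumes
parameters:
  type: gp3                               # EBS volume type passed to the provisioner
reclaimPolicy: Delete                     # remove the volume when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod uses the claim
```

With WaitForFirstConsumer, the volume is created in the availability zone of the node where the Pod lands, which avoids provisioning a disk the Pod cannot reach.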

How They Work Together

  1. StorageClass: Defines how storage is provisioned (e.g., AWS EBS, GCE PD).

  2. PersistentVolume: Represents a storage resource in the cluster, often created dynamically based on a StorageClass.

  3. PersistentVolumeClaim: A user’s request for storage, specifying size, access mode, and optionally, the StorageClass.

When a user creates a PVC, Kubernetes automatically finds a matching PV (or creates one based on the specified StorageClass) and binds them together. The Pod then uses the PVC to access the storage, ensuring data persistence across restarts and rescheduling of the Pod.
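The dynamic path described above can be sketched as a claim plus a Pod, with no hand-written PV at all. This assumes a StorageClass named fast-ssd already exists in the cluster; that name, along with dynamic-claim, dynamic-pod, and web-data, is hypothetical.

```yaml
# Sketch: dynamic provisioning. Assumes a StorageClass named fast-ssd exists.
# dynamic-claim, dynamic-pod, and web-data are illustrative names.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim
spec:
  storageClassName: fast-ssd    # the provisioner of this class creates the PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: dynamic-pod
spec:
  containers:
    - name: web
      image: nginx
      volumeMounts:
        - name: web-data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: web-data
      persistentVolumeClaim:
        claimName: dynamic-claim   # the Pod refers to the claim, never to the PV
```

The Pod spec is the same shape as in the static example; only the way the PV comes into existence differs.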
