Kubernetes Volume Explained : Day 29 of 40daysofkubernetes
Introduction
In our previous blog, we delved into Docker Volumes. Now, it's time to explore Kubernetes Volumes.
In a Kubernetes environment, managing data persistence is crucial, especially for stateful applications. By default, the lifecycle of a Pod and its storage are tightly coupled, meaning that when a Pod is deleted, the associated data is lost. This is where Kubernetes volumes come into play. Kubernetes volumes provide a way to maintain data across Pod restarts, ensuring that your critical data remains available even when the Pods managing the application are replaced or rescheduled.
Let's dive into this with some examples.
Example Without Persistent Volume
apiVersion: v1
kind: Pod
metadata:
  name: redis-pod
spec:
  containers:
    - name: redis
      image: redis
      volumeMounts:
        - name: redis-storage
          mountPath: /data/redis
  volumes:
    - name: redis-storage
      emptyDir: {}
Explanation:
- volumeMounts: This section defines where and how the volumes should be mounted into the container.
  - name: redis-storage refers to a volume named redis-storage (defined in the volumes section below).
  - mountPath: /data/redis specifies the directory inside the container where the volume will be mounted. In this case, the volume will be mounted at /data/redis inside the Redis container.
- volumes: This section defines the volumes that are available to be mounted by containers in the Pod.
  - name: redis-storage is the name of the volume, which is referenced by the volumeMounts in the container specification.
  - emptyDir: {} specifies the type of volume. emptyDir is a type of volume that creates an empty directory stored on the filesystem of the node where the Pod is running. The emptyDir volume starts empty when the Pod is created, and data written to this volume persists across container restarts within the same Pod. However, the data in an emptyDir is permanently deleted if the Pod is removed from the node.
This configuration sets up a Pod named redis-pod running a Redis container. The container uses an emptyDir volume to store data at /data/redis within the container. The emptyDir volume is temporary storage that exists for the lifetime of the Pod on the node, so it is useful for scratch data that does not need to outlive the Pod. Once the Pod is deleted, the data in the emptyDir volume is deleted with it, which makes it unsuitable for anything that must persist after the Pod's lifecycle ends.
Try It Out:
- Apply this YAML and exec into the Pod
kubectl apply -f redis.yaml
kubectl exec -it redis-pod -- sh
- Go to the /data/redis directory and create a file named test.txt.
- Now, delete the Pod and recreate it. Check the /data/redis directory again, and you will find that the test.txt file is gone.
Introducing Persistent Volumes
Kubernetes Volumes are an essential component for managing storage within a cluster. Let’s break down three key concepts: Persistent Volume (PV), Persistent Volume Claim (PVC), and StorageClass.
Persistent Volume (PV)
A Persistent Volume is a piece of storage in the cluster that has been provisioned by an administrator or dynamically through a StorageClass. It is independent of any individual Pod and exists beyond the lifecycle of a Pod. PVs are cluster resources, just like nodes are cluster resources.
Characteristics:
A PV is a resource in the cluster that encapsulates the storage details like capacity, access modes, and reclaim policy.
PVs are usually backed by physical storage, such as cloud storage, networked storage systems, or even local disks.
PVs are created and managed by the administrator or automatically using a StorageClass.
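To make these characteristics concrete, here is a minimal sketch of a PV backed by networked storage. It is only an illustration: the name example-nfs-pv, the NFS server address, and the export path are hypothetical placeholders, not values from this setup. It shows where capacity, access modes, and the reclaim policy live in the spec.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-nfs-pv              # hypothetical name
spec:
  capacity:
    storage: 5Gi                    # how much storage this PV offers
  accessModes:
    - ReadWriteMany                 # many nodes may mount it read-write
  persistentVolumeReclaimPolicy: Retain   # keep the data after the claim is deleted
  nfs:
    server: nfs.example.internal    # hypothetical NFS server
    path: /exports/data             # hypothetical export path
```

With Retain, the PV and its data stay around after the bound PVC is deleted and must be cleaned up manually; Delete would remove the underlying storage instead.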
Persistent Volume Claim (PVC)
A Persistent Volume Claim is a request for storage by a user. It is similar to a Pod requesting compute resources like CPU and memory. A PVC specifies the amount of storage, access modes, and an optional StorageClass.
Characteristics:
PVCs are requests for storage, similar to how Pods request CPU or memory.
When a PVC is created, Kubernetes looks for a PV that matches the claim and binds the PVC to that PV.
Once bound, the PVC can be used by Pods to access the storage.
Example on PV and PVC
We are using a control-plane node that we created on AWS. To configure it, you can refer to this blog.
On the control plane, when you run kubectl get nodes, the output should list your node(s) in the Ready state.
Create a folder named day29 and create a file index.html inside it using:
nano index.html
Add some content, like "Hello Kubernetes."
Now create pv.yaml using your favorite editor, like nano or vi, and paste the following content:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/home/ubuntu/day29"
This PersistentVolume (task-pv-volume) provides 1 GiB of storage that is located on the node's local filesystem at the path /home/ubuntu/day29. The volume is labeled as type: local, uses the standard storage class, and can be accessed in ReadWriteOnce mode, meaning it can be mounted as read-write by a single node at a time. This PV can be claimed by a PersistentVolumeClaim (PVC) that requests storage with similar characteristics, allowing Pods to use the storage in a consistent and reliable manner.
Create pvc.yaml with the following content:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
This PersistentVolumeClaim (task-pv-claim) is requesting 500 MiB of storage with the ReadWriteOnce access mode, meaning it can be mounted as read-write by a single node. It also specifies the standard StorageClass, so it can only bind to a PV whose storageClassName is also standard.
When this PVC is created, Kubernetes will try to find a matching PersistentVolume (PV) that meets these criteria: at least 500 MiB of storage, the ReadWriteOnce access mode, and the standard StorageClass. Once a suitable PV is found, it will be bound to this PVC, allowing Pods to use the storage for their data needs. Binding is one-to-one, so although the claim asks for only 500 MiB, it receives the whole 1 GiB PV.
Finally, create pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
This Pod configuration (task-pv-pod) creates a single container running the NGINX web server. The Pod uses a PersistentVolumeClaim (task-pv-claim) to request storage, which is then mounted into the container at /usr/share/nginx/html. This setup allows NGINX to serve files from the persistent storage provided by the underlying PersistentVolume (PV) linked to the PVC.
In essence, this configuration ensures that any content placed in the mounted directory will persist even if the Pod is deleted and recreated, as long as the PVC remains bound to the PV. This is particularly useful for web applications where you need to serve persistent content.
When we exec into the Pod and navigate to the /usr/share/nginx/html directory, we see all the files from the day29 folder, including index.html. They appear there because the day29 folder on the node is the storage behind the PersistentVolume bound to the PVC task-pv-claim.
The data in the day29 folder persists independently of the Pod lifecycle. This means if the Pod is deleted and recreated, the files in the day29 folder will still be available as long as the PV is intact and still bound to the PVC. Because this PV uses hostPath, the files live on one specific node, so the recreated Pod must run on that same node to see them.
StorageClass
A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, backup policies, or other characteristics.
Characteristics:
StorageClasses define the provisioner (the plugin that provides storage), parameters, and reclaim policies that define what happens to a PV when it’s released by a PVC.
When a PVC requests storage with a specific StorageClass and no existing PV matches, the provisioner named in that StorageClass dynamically creates a PV that meets the requirements of the PVC.
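As a rough illustration of these fields, here is a sketch of a StorageClass for AWS EBS. It assumes the AWS EBS CSI driver is installed in the cluster, which this post does not set up, and the class name ebs-gp3 is a hypothetical choice.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3                     # hypothetical class name
provisioner: ebs.csi.aws.com        # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3                         # EBS volume type passed to the provisioner
reclaimPolicy: Delete               # delete the EBS volume when the PV is released
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod uses the claim
```

WaitForFirstConsumer delays provisioning until a Pod is scheduled, which helps the volume land in the same availability zone as the node that will use it.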
How They Work Together
StorageClass: Defines how storage is provisioned (e.g., AWS EBS, GCE PD).
PersistentVolume: Represents a storage resource in the cluster, often created dynamically based on a StorageClass.
PersistentVolumeClaim: A user’s request for storage, specifying size, access mode, and optionally, the StorageClass.
When a user creates a PVC, Kubernetes automatically finds a matching PV (or creates one based on the specified StorageClass) and binds them together. The Pod then uses the PVC to access the storage, ensuring data persistence across restarts and rescheduling of the Pod.
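The dynamic path can be sketched as follows. This assumes a StorageClass named ebs-gp3 with a working provisioner already exists in the cluster; that name, along with dynamic-claim and dynamic-pod, is a hypothetical placeholder. No PV is written by hand here: the provisioner creates one when the claim is used.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim               # hypothetical name
spec:
  storageClassName: ebs-gp3         # assumed to exist in the cluster
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: dynamic-pod                 # hypothetical name
spec:
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: dynamic-claim    # the Pod refers to the claim, never to the PV
  containers:
    - name: web
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
```

The Pod section has the same shape as in the static example above; only the way the PV comes into existence differs.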
Written by
Shivam Gautam