A common question that arises when exploring Kubernetes is how to manage files and data storage when containers are constantly created and destroyed, sometimes on different machines. Kubernetes introduced Persistent Volumes (PV) and Persistent Volume Claims (PVC) to handle this challenge, offering an abstraction layer that decouples storage from pods. Let’s dive into why these objects were necessary and how they work together to provide stable, scalable storage.

The Problem with Basic Volumes in Kubernetes

The original storage concept in Kubernetes, called Volumes, allowed users to attach storage directly to a Pod. While useful for many applications, Volumes have some limitations:

Binding to Pod Lifecycle: A Volume is declared inside the Pod spec and shares the Pod’s lifecycle. For ephemeral types such as emptyDir, the data is deleted when the Pod is removed, which makes them unsuitable for data that needs persistence.
Replicated Containers: Replicas created by a ReplicaSet all come from one Pod template, so every replica carries the same Volume definition. Replicas that each need their own storage cannot express that with a plain Volume; they either point at the same backing storage or compete for a disk that can attach to only one node.
Environment-Specific Bindings: Defining a specific volume type in a Pod (e.g., an Azure Disk) makes the Pod’s definition cloud provider-specific. This conflicts with Kubernetes’ goal of being a platform-independent orchestration tool.
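
The lifecycle limitation is easiest to see with an emptyDir Volume. The manifest below is a minimal sketch; the Pod name, container name, busybox image, and command are illustrative assumptions, not part of any particular setup.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod            # hypothetical name
spec:
  containers:
    - name: writer
      image: busybox:1.36
      # Append a timestamp to a file on the Volume every 5 seconds
      command: ["sh", "-c", "while true; do date >> /cache/log.txt; sleep 5; done"]
      volumeMounts:
        - mountPath: /cache
          name: scratch
  volumes:
    - name: scratch
      emptyDir: {}             # created with the Pod, deleted with the Pod
```

A container restart keeps /cache/log.txt, but deleting the Pod removes the emptyDir and its contents for good.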

The Solution: Persistent Volumes and Persistent Volume Claims

To overcome these limitations, Kubernetes introduced Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). These objects create an abstraction layer that allows storage to be decoupled from Pods, offering greater flexibility, scalability, and portability.

Persistent Volumes (PVs): A PV is a standalone Kubernetes object that represents a storage resource. This resource could be a cloud-based storage disk, an NFS server, or any other supported storage system. By creating PVs as independent resources, Kubernetes can now offer storage that persists beyond the Pod’s lifecycle.
Persistent Volume Claims (PVCs): PVCs are requests for storage. A Pod references a PVC by name, and Kubernetes binds the PVC to a PV that satisfies its size and access-mode requirements, so the Pod never specifies the type or source of the storage directly. This allows Pods to remain agnostic to the storage backend, enhancing portability. Because each claim binds to exactly one PV, giving each Pod its own claim gives it dedicated storage.
Flexible Binding: With PVs and PVCs, Kubernetes administrators can create pools of storage resources, while application developers can request storage without needing to know the specific infrastructure details. When a StorageClass is configured, PVs do not even have to be created in advance: Kubernetes can provision one dynamically when a claim is created.
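
Dynamic provisioning can be sketched with a StorageClass and a claim that names it. This is an illustration under stated assumptions: the class name standard-ssd and the claim name app-data are hypothetical, and the provisioner assumes a cluster with the Azure Disk CSI driver installed. Other clusters would use their own provisioner and parameters.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-ssd                      # hypothetical class name
provisioner: disk.csi.azure.com           # assumes the Azure Disk CSI driver
parameters:
  skuName: StandardSSD_LRS                # disk type requested from the provider
reclaimPolicy: Delete                     # remove the disk when the claim is deleted
volumeBindingMode: WaitForFirstConsumer   # provision once a Pod uses the claim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                          # hypothetical claim name
spec:
  storageClassName: standard-ssd          # ask this class to provision a PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

Only the StorageClass mentions the cloud provider. The claim, and any Pod that uses it, stays provider-neutral.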

Benefits of PVs and PVCs in Kubernetes

The PV-PVC model offers multiple advantages:

Data Persistence: PVs ensure that data remains available even if the Pod accessing it is deleted or recreated.
Portability: PVCs allow Pods to request storage resources without specifying a particular storage provider, making applications more portable.
Scalability: Each Pod in a replicated setup can use its own PVC, and therefore its own PV, supporting applications that need distinct storage for each instance.

Conclusion

Persistent Volumes and Persistent Volume Claims address the limitations of Kubernetes Volumes, making storage more resilient, flexible, and portable across different environments. With PVs and PVCs, Kubernetes users can confidently deploy applications that require persistent storage, even in dynamic, scalable environments.

Let’s dive deeper with some practical examples to illustrate why Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) are beneficial in Kubernetes.

Example 1: A Basic Volume Bound to a Pod

Imagine you have a Pod running a database, and you want this database to store data in a specific location on disk. You can create a Volume within the Pod configuration and mount it directly to the container:

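apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: db-container
      image: mysql:8.0
      env:
        # The mysql image will not start without a root password setting;
        # this value is a placeholder, use a Secret in practice
        - name: MYSQL_ROOT_PASSWORD
          value: change-me
      volumeMounts:
        - mountPath: /var/lib/mysql   # MySQL data directory inside the container
          name: db-storage
  volumes:
    - name: db-storage
      hostPath:
        path: /data/mysql             # directory on the node running the Pod
        type: DirectoryOrCreate       # create the directory if it is missing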

Here, the Volume is set up with hostPath, which binds it to a specific path on the host’s filesystem. This approach has limitations:

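The data lives in /data/mysql on whichever node runs the Pod. It stays on that node when the Pod is deleted, but a replacement Pod scheduled on a different node starts with an empty directory, so the data is effectively lost to the application.
If we scale this database to multiple replicas, replicas scheduled on the same node would write to the same hostPath, which is not safe for concurrent access, while replicas on other nodes would each see a separate, unrelated directory.
The hostPath volume type ties the data to the specific host the Pod is scheduled on, which reduces portability and scalability.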

Example 2: Persistent Volume and Persistent Volume Claim

To address these limitations, Kubernetes introduces Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). Let’s see how these solve the issues above.

Persistent Volume (PV): We’ll create a PV representing an independent storage resource, like a networked storage disk that persists even if the Pod using it is removed.
Persistent Volume Claim (PVC): We’ll create a PVC that a Pod can use to request storage without knowing the specifics of the storage type or backend.

Step 1: Define the Persistent Volume (PV)

First, we define a PV. Here’s an example using a PersistentVolume with NFS storage:

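apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  # Keep the volume and its data after the claim is deleted
  # (Retain is also the default for manually created PVs)
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /path/to/storage            # exported directory on the NFS server
    server: nfs-server.example.com    # placeholder hostname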

This PersistentVolume object represents a 10 GiB NFS storage resource, which is now available in the cluster for claims to bind to.

Step 2: Define the Persistent Volume Claim (PVC)

Now, let’s create a PVC that requests this storage without directly referencing the NFS configuration. The claim states only what is needed, 10 GiB of storage and an access mode, not the storage backend details:

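apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  # Empty string: bind to a pre-created PV that has no storage class,
  # instead of provisioning from the cluster's default StorageClass
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi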

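Here, the PersistentVolumeClaim (PVC) requests a volume of 10 GiB with the ReadWriteOnce access mode, meaning it can be mounted read-write by a single node. Kubernetes then automatically binds this PVC to an available PV that meets these criteria. One caveat: on clusters with a default StorageClass, a claim that omits storageClassName is dynamically provisioned from that class instead of binding to a pre-created PV. Setting storageClassName to an empty string on the claim disables this.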
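
When a claim must land on one particular PV, not just any matching one, it can name that PV. The following is a hedged variant of the claim above, assuming the PV from Step 1 is called example-pv and has no storage class:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  storageClassName: ""       # empty string: skip any default StorageClass
  volumeName: example-pv     # reserve this specific PV for the claim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

The size and access mode must still be satisfiable by the named PV; otherwise the claim stays in the Pending state.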

Step 3: Using the PVC in a Pod

Finally, we can create a Pod that uses this PVC to claim storage without worrying about the underlying storage configuration:

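apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: db-container
      image: mysql:8.0
      env:
        # The mysql image will not start without a root password setting;
        # this value is a placeholder, use a Secret in practice
        - name: MYSQL_ROOT_PASSWORD
          value: change-me
      volumeMounts:
        - mountPath: /var/lib/mysql   # MySQL data directory inside the container
          name: db-storage
  volumes:
    - name: db-storage
      persistentVolumeClaim:
        claimName: example-pvc        # must be in the same namespace as the Pod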

In this example, the Pod mounts the PVC example-pvc at /var/lib/mysql. The Pod refers only to the claim; Kubernetes resolves the claim to the PersistentVolume it is bound to and mounts that storage, so the Pod definition contains no infrastructure-specific details.

Benefits Highlighted by This Example

Data Persistence: The data in /var/lib/mysql is stored in a Persistent Volume. Even if the Pod is deleted, the storage (PV) remains, ensuring data durability.
Separation of Concerns: The Pod requests storage generically using a PVC. The storage details (e.g., NFS configuration) are only specified in the PV, enhancing portability and reuse.
Scalability: If we want multiple Pods with individual storage, each Pod can request its own PVC. Each PVC can then bind to separate PVs, allowing for unique storage per replica.
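
One way to get a separate claim per replica without writing each PVC by hand is a StatefulSet with volumeClaimTemplates. The manifest below is a minimal sketch, not a production MySQL setup: the name db, the headless Service it refers to, and the placeholder password are assumptions, and the three Pods are independent MySQL instances, not a replicated cluster.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                        # hypothetical name
spec:
  serviceName: db                 # assumes a headless Service named "db" exists
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db-container
          image: mysql:8.0
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: change-me    # placeholder; use a Secret in practice
          volumeMounts:
            - mountPath: /var/lib/mysql
              name: db-storage    # matches the claim template below
  volumeClaimTemplates:
    - metadata:
        name: db-storage
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
```

Kubernetes creates one claim per replica (db-storage-db-0, db-storage-db-1, db-storage-db-2), and each one needs either a matching pre-created PV or a StorageClass that can provision one.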

Why Persistent Volumes (PV) and Persistent Volume Claims (PVC) Exist in Kubernetes?