Kubernetes - Volumes

First, let's go over a few points about container volumes:

How does a Docker container manage data?

  1. Docker Writable layer Data Storage : (Non-persistent)

    Each Docker container has a writable layer on top of its read-only image layers, and anything the application writes to the container's filesystem is stored there. Docker keeps this layer in its own storage area on the host, but the data is tied to that one container. It survives a stop, crash, or restart of the same container, and it is lost as soon as the container is removed (for example with docker rm or when it was started with --rm) or replaced by a new container.

  2. Docker-managed Volume (e.g., /var/lib/docker/volumes/) : (Persistent)

    A volume that Docker creates and manages itself. The data is stored in Docker's default volume directory on the host (/var/lib/docker/volumes/ on Linux) and outlives any container that uses it.

     docker volume create my_volume
     docker run -v my_volume:/var/lib/mysql mysql
    
     Here, Docker will create and manage the my_volume under 
     /var/lib/docker/volumes/, and the container will see it as /var/lib/mysql.
    
  3. Bind Mount (e.g., /data/mysql on the host) : (Persistent)

    Directly maps a specific directory or file from the host machine into the container.

    The user explicitly provides a host directory (e.g., /data/mysql).

     docker run -v /data/mysql:/var/lib/mysql mysql
    
     In this case, you are directly mounting /data/mysql from the host into 
     the container’s /var/lib/mysql directory. The container will store MySQL 
     data directly in /data/mysql
    
  4. tmpfs Mount: (Non-persistent)

    Stores data in the host's memory (RAM) instead of on disk, so the data is temporary and is lost when the container stops. tmpfs mounts are only available when Docker runs on Linux.

     docker run --tmpfs /container/path my_image
    
     Useful for cases where you need high-speed storage and don’t need 
     to persist the data, like caching.
    
  5. External Volumes (e.g., AWS EBS): (Persistent)

    • External volumes refer to storage provided by an external source (e.g., AWS EBS, Azure Disk). They can be attached to Docker containers to store data outside the host machine.

    • These volumes are often remote storage solutions. They are not tied to the Docker host itself. Useful in distributed and cloud environments.

    • How to Use with Docker:

      • Docker Volume Plugin: To use an external volume like AWS EBS, you typically use a volume plugin that allows Docker to communicate with the external storage service.

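      • For example, for AWS, the third-party REX-Ray project provides the rexray/ebs plugin, which lets Docker manage AWS EBS volumes as Docker volumes. The plugin has to be installed on the host and configured with AWS credentials before it can be used as a volume driver.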

Example using AWS EBS with Docker:

    docker volume create \
      --driver rexray/ebs \
      -o size=20 \
      my-ebs-volume

Then, you can mount it in a container:

    docker run -v my-ebs-volume:/data my_image
    

Types of External Volumes:

  • Cloud Block Storage:

    • AWS EBS (Elastic Block Store), Azure Disk, Google Persistent Disk
  • Network-based Volumes:

    • NFS (Network File System), GlusterFS, Ceph, Amazon EFS (Elastic File System)

Options while creating Docker volumes:

  1. -v (or --volume) Option: (Limited Control)

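    It is simpler and great for quick and straightforward mounts. Everything is packed into one colon-separated string, so additional options are limited to a short suffix such as :ro for read-only, and the mount type is implied by the source: a name means a named volume, an absolute path means a bind mount.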

     docker run -v [source]:[destination] [image] (or)
     docker run --volume [source]:[destination] [image]
    
     Named Volume: docker run -v my_volume:/data my_image
     Bind Mount: docker run -v /host/path:/container/path my_image
    
  2. --mount Option: (Explicit Control)

    The --mount option was introduced to provide a more flexible and clear syntax, especially when dealing with complex scenarios like specifying bind mounts, volumes, and tmpfs mounts.

     docker run --mount type=[type],source=[source],target=[destination],[options] [image]
    
     Named Volume:
     docker run --mount type=volume,source=my_volume,target=/data my_image
     This mounts the Docker-managed volume my_volume at /data inside 
     the container.
    
     Bind Mount:
     docker run --mount type=bind,source=/host/path,target=/container/path,readonly my_image
     This bind mounts /host/path to /container/path as read-only inside the 
     container. It does the same thing as -v /host/path:/container/path:ro but is 
     more readable, making it clearer that the mount is read-only.
    
     Mounting tmpfs using --mount:
     docker run --mount type=tmpfs,target=/container/tmpfs my_image
     This mounts a temporary filesystem (tmpfs) inside the container, 
     which will be stored in memory.
    

Now, we will dive deep into volumes in Kubernetes:

How a Kubernetes Pod Manages Data:

Like Docker containers, Kubernetes Pods are ephemeral: data written to a container's own filesystem is lost when that container restarts or when the Pod is deleted or rescheduled. Kubernetes offers several volume types to keep data across container restarts or Pod rescheduling.

Pod Lifecycle Data Storage:

Kubernetes Pods have two types of data storage methods, persistent and non-persistent, depending on the type of volume attached.

  1. Kubernetes Pod Writable Layer Data Storage: (Non-persistent)

    Each container within a Pod has its own ephemeral storage, where data is written to a writable layer within the container’s filesystem. This storage is non-persistent: if the container restarts, or the Pod is deleted or rescheduled, the data in the container’s writable layer is lost.

  2. emptyDir: (Non-persistent)

    A temporary directory that lives as long as the Pod lives. It starts empty and can be used by containers in the Pod to share data.

    • Usage: As soon as the Pod is deleted or removed from its node, the data inside the emptyDir is lost. A restart of a single container inside the Pod does not clear it.

    • Use Case: Temporary scratch space, such as for caches, intermediate results, or temporary storage.

    Note:
    A container crashing does not remove a Pod from a node. 
    The data in an emptyDir volume is safe across container crashes.

    volumes:
    - name: cache-volume
      emptyDir: {}
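
The volumes fragment above only declares the volume; containers still have to mount it. As a minimal sketch, the Pod below shares one emptyDir between two containers. The Pod name, container names, images, and paths are illustrative assumptions, not part of the original example.

```yaml
# Sketch: two containers in one Pod sharing an emptyDir volume.
# All names, images, and paths here are assumptions for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: cache-demo
spec:
  containers:
    - name: writer
      image: busybox:1.36
      # Appends a timestamp to a file in the shared volume every 5 seconds
      command: ["sh", "-c", "while true; do date >> /cache/out.log; sleep 5; done"]
      volumeMounts:
        - name: cache-volume
          mountPath: /cache
    - name: reader
      image: busybox:1.36
      # Follows the same file through its own mount path
      command: ["sh", "-c", "touch /data/out.log; tail -f /data/out.log"]
      volumeMounts:
        - name: cache-volume
          mountPath: /data
  volumes:
    - name: cache-volume
      emptyDir: {}   # use "emptyDir: {medium: Memory}" for a RAM-backed volume
```

Both containers see the same files even though they mount the volume at different paths. Deleting the Pod discards the contents.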

  3. PersistentVolumeClaim (PVC): (Persistent)

    The most common method to persist data in Kubernetes. A PVC is a request for storage that Kubernetes binds to a matching PersistentVolume (PV), which can be backed by external storage (NFS, cloud disks like AWS EBS, etc.).

    • Usage: PVCs allow long-term storage that persists beyond Pod restarts or deletions.

    • Use Case: Databases, application logs, or data that needs to persist beyond Pod lifetimes.

    volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: my-pvc

Here, the Pod references the PVC (my-pvc) as a volume named persistent-storage. Each container that needs the data then mounts that volume at a path of its choice through volumeMounts.
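
For completeness, a minimal sketch of the claim itself together with a Pod that mounts it. It assumes the cluster has a default StorageClass that can provision a 1Gi volume dynamically; the requested size, access mode, Pod name, image, and mount path are illustrative assumptions.

```yaml
# Sketch: a PVC plus a Pod that mounts it.
# Assumes a default StorageClass exists; sizes and names are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce        # mounted read-write by a single node
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: persistent-storage   # must match the volume name below
          mountPath: /var/lib/app    # path inside the container
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: my-pvc
```

Files written under /var/lib/app stay in the bound PersistentVolume when the Pod is deleted, so a replacement Pod that references the same claim sees them again.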

  4. hostPath: (Persistent, but host-bound):

    Mounts a directory or file from the Kubernetes node's filesystem into the Pod.

    • Usage: Allows a Pod to access files or directories on the node where the Pod is running. However, if the Pod is rescheduled to a different node, the data does not follow.

    • Use Case: Accessing the node’s specific files, like system logs, or when needing to use node-specific resources (e.g., hardware devices).

    • Example: Accessing host logs or Docker socket.

    volumes:
    - name: host-volume
      hostPath:
        path: /data

Here, the directory /data on the Kubernetes node is exposed to the Pod as a hostPath volume named host-volume. A container still needs a volumeMounts entry to mount it.
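
A minimal sketch of a complete Pod for the "host logs" use case mentioned above. The Pod name, image, and the choice of /var/log are assumptions for illustration. The mount is read-only because hostPath gives the container direct access to the node's filesystem.

```yaml
# Sketch: read node logs through a hostPath volume.
# Names, image, and the /var/log path are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
    - name: log-reader
      image: busybox:1.36
      command: ["sh", "-c", "ls /host-logs; sleep 3600"]
      volumeMounts:
        - name: host-volume
          mountPath: /host-logs
          readOnly: true       # the container cannot modify node files
  volumes:
    - name: host-volume
      hostPath:
        path: /var/log
        type: Directory        # the Pod fails to start if the path is not an existing directory
```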

  5. configMap and secret: (Persistent, for configurations):

    • configMap:

      • Purpose: Injects configuration data into containers as key-value pairs or files.

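      • Use Case: Store configuration settings, environment variables, or configuration files outside the container image. Files mounted from a ConfigMap volume are refreshed after the ConfigMap changes, while values injected as environment variables are only read when the container starts.

      • Example: Database URLs, feature flags, or application config files (non-sensitive data only).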

    • secret:

      • Purpose: Injects sensitive data (e.g., passwords, tokens, certificates) securely into containers.

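      • Use Case: Keep sensitive information out of Pod specs and container images. Note that Secret values are only base64-encoded, not encrypted, unless encryption at rest is enabled for the cluster.

      • Example: Database passwords, API tokens, or TLS certificates.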

ConfigMap Example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: app-config
    data:
      DATABASE_URL: "jdbc:mysql://mysql-service:3306/petclinic"
      APP_MODE: "production"

Accessing a ConfigMap in a Pod:

    apiVersion: v1
    kind: Pod
    metadata:
      name: app-pod
    spec:
      containers:
        - name: app-container
          image: my-app:latest
          env:
            - name: DATABASE_URL
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: DATABASE_URL
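
The Pod above reads the ConfigMap through an environment variable. Since this article is about volumes, here is a minimal sketch of the other option, mounting the same app-config ConfigMap as files. The Pod name, volume name, and mount path are illustrative assumptions.

```yaml
# Sketch: mount the app-config ConfigMap as a volume.
# Pod name, volume name, and mount path are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: app-pod-files
spec:
  containers:
    - name: app-container
      image: my-app:latest
      volumeMounts:
        - name: config-volume
          mountPath: /etc/app-config
          readOnly: true
  volumes:
    - name: config-volume
      configMap:
        name: app-config     # the ConfigMap defined earlier
```

Each key becomes a file, so the container would see /etc/app-config/DATABASE_URL and /etc/app-config/APP_MODE, with the values as file contents.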

Secret Example:

    apiVersion: v1
    kind: Secret
    metadata:
      name: db-secret
    type: Opaque
    data:
      username: dXNlcm5hbWU=  # base64 encoded "username"
      password: cGFzc3dvcmQ=  # base64 encoded "password"

Accessing a Secret in a Pod:

    apiVersion: v1
    kind: Pod
    metadata:
      name: db-pod
    spec:
      containers:
        - name: db-container
          image: mysql:latest
          env:
            - name: DB_USERNAME
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: username
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: password
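
A Secret can also be mounted as a volume instead of being exposed through environment variables. The sketch below mounts db-secret as read-only files. The Pod name, image, volume name, mount path, and file mode are illustrative assumptions.

```yaml
# Sketch: mount the db-secret Secret as files.
# Pod name, image, mount path, and file mode are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: db-pod-files
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "ls -l /etc/db-credentials; sleep 3600"]
      volumeMounts:
        - name: secret-volume
          mountPath: /etc/db-credentials
          readOnly: true
  volumes:
    - name: secret-volume
      secret:
        secretName: db-secret   # the Secret defined earlier
        defaultMode: 0400       # files readable only by the owning user
```

The container sees /etc/db-credentials/username and /etc/db-credentials/password, already decoded from base64.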

  6. NFS (Network File System): (Persistent):

    NFS (Network File System) is a distributed file system protocol that allows you to share files and directories over a network.

    Key Features of NFS:

    • File Sharing: Allows multiple clients to read and write to the same files over a network.

    • Persistence: Data remains available even if the pods using it are terminated, making it ideal for applications that require persistent storage.

Example of Using NFS in Kubernetes

  1. Set Up an NFS Server: You need to have an NFS server running, which will share the directory over the network.

    Example: You might create an NFS server on a VM or physical server, exporting a directory:

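     # On the NFS server (run as root)
     mkdir -p /srv/nfs/kubedata
     # "*" exports to every client; restrict it to your node subnet outside of test setups
     echo "/srv/nfs/kubedata *(rw,sync,no_root_squash)" >> /etc/exports
     exportfs -a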
    
  2. Create a Persistent Volume (PV): In Kubernetes, define a Persistent Volume that points to the NFS share.

Persistent Volume Definition:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: nfs-pv
    spec:
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteMany  # Allows multiple pods to read/write
      nfs:
        path: /srv/nfs/kubedata
        server: <NFS_SERVER_IP>  # Replace with the NFS server's IP

  3. Create a Persistent Volume Claim (PVC): A PVC requests storage from the PV.

    Persistent Volume Claim Definition:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: nfs-pvc
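    spec:
      storageClassName: ""  # Empty class: bind to a pre-created PV such as nfs-pv, not a dynamically provisioned one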
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 10Gi

  4. Use the PVC in a Deployment: You can now mount the PVC in a pod to access the shared storage.

Deployment Example:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: app-container
              image: my-app-image
              volumeMounts:
                - mountPath: /mnt/data  # Path inside the container
                  name: nfs-storage
          volumes:
            - name: nfs-storage
              persistentVolumeClaim:
                claimName: nfs-pvc

Key Points:

  • NFS allows you to share files across multiple pods in a Kubernetes cluster.

  • Persistent Volumes (PV) and Persistent Volume Claims (PVC) enable the use of NFS storage, ensuring data persistence and accessibility.

  • This setup is ideal for applications requiring shared access to files, such as content management systems or data processing applications.

Written by

Subbu Tech Tutorials