Understanding Ephemeral Storage in Kubernetes
Ephemeral storage in Kubernetes refers to the storage associated with a pod that exists only for the lifetime of that pod. It is typically used for temporary data that does not need to persist beyond the pod's lifecycle. Ephemeral storage is usually backed by the node’s local storage and is not designed for long-term data retention. Volume-based ephemeral storage is wiped once the pod is deleted, while a container's writable layer is already lost when that container is restarted.
Usage of Ephemeral Storage
Ephemeral storage in Kubernetes can be used in various scenarios, such as:
Scratch space for applications:
- Applications that require temporary space for intermediate data processing.
Caching:
- Temporary storage for cached data that can be regenerated if lost.
Logs:
- Storing log data that can be offloaded to persistent storage or log aggregation systems.
Temporary storage for build artifacts:
- During CI/CD processes, ephemeral storage can be used to store build artifacts that are needed only temporarily.
Types of Ephemeral Storage in Kubernetes
emptyDir:
- A volume that is created when a pod is assigned to a node and exists as long as the pod is running on that node. It is empty at Pod startup, with storage coming locally from the kubelet base directory (usually /var/lib/kubelet on the root disk) or from RAM.
configMap, secret:
- Used for storing configuration data and sensitive information that can be consumed by pods.
downwardAPI:
- Allows pods to consume metadata about themselves or their environment.
CSI ephemeral volumes:
- Similar to the previous volume kinds, but provided by special CSI drivers which specifically support this feature.
Generic ephemeral volumes:
- Can be provided by all storage drivers that also support persistent volumes.
emptyDir, configMap, downwardAPI, and secret are provided as local ephemeral storage. They are managed by the kubelet on each node.
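To make two of the volume kinds above more concrete, here is a minimal sketch of a Pod that mounts a RAM-backed emptyDir and a generic ephemeral volume. The Pod name, volume names, mount paths, sizes, and the storage class scratch-storage-class are all hypothetical; the generic ephemeral volume only works if your cluster has a storage class with dynamic provisioning.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-volumes-demo      # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: cache
      mountPath: /cache
    - name: scratch
      mountPath: /scratch
  volumes:
  # emptyDir backed by tmpfs (RAM) instead of the node's disk
  - name: cache
    emptyDir:
      medium: Memory
      sizeLimit: 64Mi
  # Generic ephemeral volume: a PVC is created with the Pod and deleted with it
  - name: scratch
    ephemeral:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: scratch-storage-class   # assumption: replace with a class from your cluster
          resources:
            requests:
              storage: 1Gi
```

Keep in mind that files written to a tmpfs-backed emptyDir consume memory, so they count against the container's memory limit rather than its disk usage.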
Default Ephemeral Storage Type
When a Pod is started in Kubernetes, the default type of ephemeral storage used is the node’s local storage associated with the pod's lifecycle. Specifically, the form of ephemeral storage that is used by default, without any explicit configuration, is:
Container Writable Layer
Each container in a pod has a writable layer provided by the container runtime (e.g., Docker, containerd). This writable layer is ephemeral and used by default for any file operations within the container that are not mapped to other volumes.
Automatic Ephemeral Storage Features
Container Writable Layer:
- Each container has its own writable layer where it can write files. This writable layer is ephemeral and lasts only for the duration of the container. Once the container is terminated or restarted, the data in this writable layer is lost.
EmptyDir (if specified):
- If an emptyDir volume is explicitly specified in the pod's configuration, it provides a dedicated ephemeral storage that is shared among all containers in the pod. This storage is tied to the pod's lifecycle and is created when the pod is scheduled and deleted when the pod is terminated.
Here's an example of a pod definition with an emptyDir volume specified:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: busybox
    command: ["sh", "-c", "echo Hello Kubernetes! > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - mountPath: /data
      name: temp-storage
  volumes:
  - name: temp-storage
    emptyDir: {}
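Since an emptyDir volume is shared among the containers of a pod, a two-container variant may illustrate the point better. The following is only a sketch: the pod name, container names, and the out.log file are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch-pod          # hypothetical name
spec:
  containers:
  # Appends a timestamp to a file in the shared volume every 5 seconds
  - name: writer
    image: busybox
    command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
    volumeMounts:
    - name: temp-storage
      mountPath: /data
  # Follows the same file through its own mount of the same volume
  - name: reader
    image: busybox
    command: ["sh", "-c", "touch /data/out.log && tail -f /data/out.log"]
    volumeMounts:
    - name: temp-storage
      mountPath: /data
  volumes:
  - name: temp-storage
    emptyDir: {}
```

Running kubectl logs shared-scratch-pod -c reader should then show the lines written by the writer container, because both containers mount the same directory.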
Example
If you start a pod without specifying any volumes, the writable layer of each container acts as the ephemeral storage by default. Here is an example of a simple pod configuration without any explicit ephemeral storage:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: busybox
    command: ["sh", "-c", "echo Hello Kubernetes! > /tmp/hello.txt && sleep 3600"]
In this example, the container uses its writable layer to write the file /tmp/hello.txt. This storage is ephemeral and will be lost if the container is terminated or restarted.
To see the writable layer of a container in Kubernetes, you typically need to access the container's filesystem. The writable layer is where all changes made to the filesystem of the container (such as creating or modifying files) are stored. Here's how you can inspect the writable layer:
Exec into the container:
kubectl exec -it example-pod -c example-container -- /bin/sh
Inspect the filesystem. You should see the hello.txt file:
cd /tmp
ls -l
By default, when a pod is started in Kubernetes, the container’s writable layer is used as the ephemeral storage. If you need shared ephemeral storage among containers in the same pod, you can explicitly define an emptyDir volume. The writable layer is tied to the container lifecycle, while emptyDir (if used) is tied to the pod lifecycle.
Best Practices
Resource Requests and Limits
Set requests and limits for ephemeral storage to ensure that no single pod can consume all the node's storage. This helps in managing resource allocation and preventing resource exhaustion.
In the following example, the Pod has two containers. Each container has a request of 2GiB of local ephemeral storage. Each container has a limit of 4GiB of local ephemeral storage. Therefore, the Pod has a request of 4GiB of local ephemeral storage, and a limit of 8GiB of local ephemeral storage. 500Mi of that limit could be consumed by the emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"
    volumeMounts:
    - name: ephemeral
      mountPath: "/tmp"
  - name: log-aggregator
    image: images.my-company.example/log-aggregator:v6
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"
    volumeMounts:
    - name: ephemeral
      mountPath: "/tmp"
  volumes:
  - name: ephemeral
    emptyDir:
      sizeLimit: 500Mi
The sizeLimit field under emptyDir specifies the maximum size for the entire emptyDir volume. An emptyDir volume is a temporary directory that initially starts empty and is deleted when the Pod is removed.
In the given example, the sizeLimit of 500Mi restricts the total size of the ephemeral volume that is mounted to /tmp in both containers.
Each container (app and log-aggregator) is allowed to request 2Gi and use up to 4Gi of ephemeral storage individually.
However, the shared ephemeral volume (emptyDir) has a sizeLimit of 500Mi, meaning the combined storage usage of both containers in /tmp cannot exceed 500Mi.
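Per-container requests and limits only protect a node if every workload sets them. One way to cap the total across a namespace is a ResourceQuota. The sketch below assumes a hypothetical namespace team-a and arbitrary values; it is not part of the example above.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-storage-quota     # hypothetical name
  namespace: team-a                 # hypothetical namespace
spec:
  hard:
    # Sum of ephemeral-storage requests across all pods in the namespace
    requests.ephemeral-storage: 20Gi
    # Sum of ephemeral-storage limits across all pods in the namespace
    limits.ephemeral-storage: 40Gi
```

With these values, the frontend Pod from the example above would consume 4Gi of the request budget and 8Gi of the limit budget in that namespace.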
Log Management
Use log aggregation solutions (e.g., ELK Stack, Fluentd) to collect and store logs from pods to avoid losing critical log data when pods are deleted.
Monitoring and Alerts
Implement monitoring for ephemeral storage usage. Tools like Prometheus and Grafana can be used to set up alerts for storage thresholds.
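As a rough sketch of such an alert, the Prometheus rule below fires when a node's root filesystem drops below 15% free space. It assumes that node_exporter is deployed, that the kubelet directory lives on the filesystem mounted at /, and that the 15% threshold and the rule names suit your environment; all of these are illustrative choices, and the mountpoint label may differ depending on how node_exporter sees the host filesystem.

```yaml
groups:
- name: ephemeral-storage           # hypothetical group name
  rules:
  - alert: NodeEphemeralStorageLow  # hypothetical alert name
    # Fraction of free space on the node's root filesystem (node_exporter metrics)
    expr: |
      node_filesystem_avail_bytes{mountpoint="/"}
        / node_filesystem_size_bytes{mountpoint="/"} < 0.15
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Less than 15% disk space left on {{ $labels.instance }}"
```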
If the kubelet is managing local ephemeral storage as a resource, then the kubelet measures storage use in:
- emptyDir volumes, except tmpfs emptyDir volumes
- directories holding node-level logs
- writeable container layers
If a Pod is using more ephemeral storage than you allow it to, the kubelet sets an eviction signal that triggers Pod eviction.
For container-level isolation, if a container's writable layer and log usage exceeds its storage limit, the kubelet marks the Pod for eviction.
For pod-level isolation, the kubelet works out an overall Pod storage limit by summing the limits for the containers in that Pod. In this case, if the sum of the local ephemeral storage usage from all containers and also the Pod's emptyDir volumes exceeds the overall Pod storage limit, then the kubelet also marks the Pod for eviction.
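To observe this behavior in a test cluster, a Pod can deliberately write more than its limit allows. The manifest below is a sketch with hypothetical names and sizes: the container writes about 200 MiB into its writable layer while its ephemeral-storage limit is 100Mi.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: storage-hog                 # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: writer
    image: busybox
    # Writes 200 x 1 MiB of zeros to the writable layer, then idles
    command: ["sh", "-c", "dd if=/dev/zero of=/tmp/fill bs=1M count=200 && sleep 3600"]
    resources:
      requests:
        ephemeral-storage: "50Mi"
      limits:
        ephemeral-storage: "100Mi"
```

Because the kubelet measures usage periodically, the eviction is not immediate. After a short while, kubectl describe pod storage-hog should report that the Pod was evicted for exceeding its local ephemeral storage limit.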
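By default, ephemeral container data on the Kubernetes node is located under:
/var/lib/kubelet
/var/lib/containers
The second path depends on the container runtime: /var/lib/containers is used by CRI-O, while containerd defaults to /var/lib/containerd.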
To show the ephemeral storage usage on the node use:
df -h /var/lib
To show the ephemeral storage usage inside the container:
du -h .
du -h [directory]
Data Backup
For critical temporary data, implement mechanisms to periodically back up data to persistent storage solutions.
Clean Up
Ensure that ephemeral storage is properly cleaned up when pods are terminated to avoid orphaned data and wasted storage space. Even terminated and failed pods take up space until they are deleted. Therefore, it is important to regularly remove such leftover pods from the cluster.
You can use the following Bash script in your CI/CD pipelines or clean your cluster manually by starting the script from your local machine: https://github.com/Brain2life/bash-cookbook/tree/k8s-cleanup-pods
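For pods created by Jobs (for example CI/CD build steps), Kubernetes can also do part of this cleanup itself. The sketch below uses the ttlSecondsAfterFinished field of a Job; the Job name, command, and the 300-second value are hypothetical, and the mechanism only covers Job-managed pods, not pods created in other ways.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: build-artifacts             # hypothetical name
spec:
  # Delete the Job and its pods 5 minutes after it completes or fails
  ttlSecondsAfterFinished: 300
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: build
        image: busybox
        command: ["sh", "-c", "echo building > /tmp/artifact.txt"]
```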
Types of Issues with Ephemeral Storage
Data Loss:
- Since ephemeral storage is tied to the pod lifecycle, any data stored there will be lost if the pod is deleted or crashes.
Resource Contention:
- Without proper resource limits, pods might consume more storage than expected, leading to node resource contention and potential disruptions.
Node Disk Pressure:
- High usage of ephemeral storage can cause disk pressure on nodes, triggering eviction of pods or other resource management actions by the kubelet.
Limited Capacity:
- Nodes have finite storage capacity, and excessive usage of ephemeral storage can exhaust available space, affecting the overall cluster performance.
No Persistence:
- Ephemeral storage is not suitable for storing data that needs to be preserved across pod restarts or crashes. Applications requiring persistent storage should use Persistent Volumes (PVs) and Persistent Volume Claims (PVCs).
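The node disk pressure behavior mentioned above is governed by the kubelet's eviction thresholds. As a sketch, a kubelet configuration file could set hard eviction thresholds explicitly. The values shown are, to the best of my knowledge, the usual Linux defaults; treat them as an assumption and check them against your Kubernetes version before changing anything.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  # Evict pods when available memory falls below 100Mi
  memory.available: "100Mi"
  # Evict pods when the node filesystem has less than 10% free space
  nodefs.available: "10%"
  # Evict pods when the node filesystem has less than 5% free inodes
  nodefs.inodesFree: "5%"
  # Evict pods when the image filesystem has less than 15% free space
  imagefs.available: "15%"
```

Raising nodefs.available makes the kubelet react earlier to growing ephemeral storage usage, at the cost of leaving more disk space unused.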
Understanding and effectively managing ephemeral storage in Kubernetes is crucial for ensuring the stability and performance of applications running in the cluster.
Written by
Maxat Akbanov