Persistent Storage in Docker: Volumes, Bind Mounts, and tmpfs
Introduction
As developers, we often run stateless containers in development or production, where containers can be destroyed and recreated without any loss of state. However, many real-world applications require stateful behavior—whether for databases, file storage, or logs—that persists across container restarts, re-deployments, or updates. This is where Docker volumes and persistent storage come into play.
In this blog, we’ll take a deep dive into Docker volumes, covering how they work, different types of storage options, when and how to use them, and advanced topics like volume management and data persistence in multi-container applications.
Why Persistent Storage?
Docker containers are ephemeral by nature. When a container is destroyed, its filesystem is also destroyed. This stateless behavior is excellent for immutable application deployments, but for many cases—like running a database, saving logs, or storing uploaded files—state persistence is essential.
Without persistent storage, every time your container is recreated, any data it wrote during its previous lifecycle is lost. For example:
If you're running a MySQL container, the database would be wiped out when the container is removed.
If you're storing logs or user-uploaded content, those files would disappear when the container is removed.
By leveraging persistent storage, we can ensure that our data persists beyond the lifecycle of individual containers.
Types of Persistent Storage in Docker
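Docker provides three primary ways to mount storage into a container. Volumes and bind mounts persist data, while tmpfs mounts are temporary:
Volumes: Managed by Docker and stored in a Docker-owned directory on the host (on Linux, under /var/lib/docker/volumes/ by default).
Bind mounts: Link a directory on the host directly to a container.
tmpfs mounts: Store data in the host's memory only; nothing is written to the host filesystem.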
1. Volumes
Volumes are the preferred method for persisting data in Docker. Docker manages them entirely, so containers refer to a volume by name rather than by a host path, which makes volumes more portable than bind mounts.
Key Features of Docker Volumes:
Volumes are stored outside the container's writable layer, so they remain intact even if the container is deleted.
Docker takes care of the management and mounting of volumes, offering an abstraction over the host filesystem.
Volumes work on both Linux and Windows environments.
They can be shared across multiple containers.
Creating and Using Volumes
You can create and manage Docker volumes using the following commands:
Creating a Volume:
docker volume create myvolume
Listing Volumes:
docker volume ls
Inspecting a Volume:
docker volume inspect myvolume
Attaching a Volume to a Container: You can mount a volume when running a container using the -v or --mount flag.
docker run -d -v myvolume:/data my-container
In this example, the myvolume volume is mounted inside the container at /data. Any changes made to files in the /data directory will persist across container restarts.
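The same mount can also be written with the more explicit --mount syntax. The sketch below wraps it in a small shell function; run_with_volume is a hypothetical helper name, and my-container is the placeholder image used in this article.

```shell
# Hypothetical helper: start a detached container with a named volume,
# using --mount instead of the -v shorthand.
run_with_volume() {
  volume="$1"   # volume name, e.g. myvolume
  target="$2"   # mount path inside the container, e.g. /data
  image="$3"    # image to run (placeholder name in this article)
  docker run -d \
    --mount "type=volume,source=${volume},target=${target}" \
    "$image"
}
```

Calling run_with_volume myvolume /data my-container should behave like the -v command above. In both forms, Docker creates the named volume if it does not exist yet.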
Volume Lifecycle Management
Volumes are persistent by default, even after the container using them is removed. However, if you want to delete a volume when it’s no longer needed, you can do so manually:
Removing a Volume:
docker volume rm myvolume
Removing all Unused Volumes:
docker volume prune
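On Docker Engine 23.0 and later, this removes only unused anonymous volumes by default; add the --all flag to also remove unused named volumes. On older versions, it removes every volume not currently in use by any container.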
2. Bind Mounts
Bind mounts are an alternative to volumes, allowing you to mount a directory from the host filesystem directly into a container. This gives the container access to files on the host, making it especially useful in development environments when you want to share code or configuration files between the host and container.
Key Differences from Volumes:
Bind mounts have no abstraction; the path you specify is mounted directly into the container.
With --mount, the directory must already exist on the host before you mount it. With -v, Docker creates a missing host path as an empty directory.
Bind mounts are less portable and flexible than volumes, as they are tied to specific paths on the host machine.
Creating a Bind Mount:
You can create a bind mount by specifying the full path on the host that should be mounted into the container:
docker run -d --mount type=bind,source=/path/on/host,target=/app my-container
In this example, the /path/on/host directory is mounted into the container at /app. Any changes made to files in /app are directly reflected in /path/on/host, and vice versa.
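When the container only needs to read the host files (configuration, for example), the bind mount can be made read-only. A minimal sketch, assuming a hypothetical run_with_bind helper that also checks the host directory before calling Docker:

```shell
# Hypothetical helper: bind-mount a host directory read-only into a container.
run_with_bind() {
  src="$1"      # host directory to share
  target="$2"   # mount path inside the container, e.g. /app
  image="$3"    # image to run (placeholder name in this article)
  # --mount refuses to start the container if the source path is missing,
  # so fail early with a clearer message.
  if [ ! -d "$src" ]; then
    echo "host directory not found: $src" >&2
    return 1
  fi
  # --mount needs an absolute source path, so resolve relative ones.
  abs_src="$(cd "$src" && pwd)"
  docker run -d \
    --mount "type=bind,source=${abs_src},target=${target},readonly" \
    "$image"
}
```

With the readonly option, writes from inside the container to the target path fail, while changes made on the host remain visible in the container.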
3. tmpfs Mounts
tmpfs mounts store data in the host's memory rather than on disk, and they are available only when running Docker on Linux. Any data written to a tmpfs mount is lost when the container stops. They are useful when you need fast temporary storage that doesn't need to persist after the container shuts down, such as for sensitive information or caching.
Using tmpfs Mounts:
docker run -d --mount type=tmpfs,tmpfs-size=64M,tmpfs-mode=1777,target=/cache my-container
In this example, a tmpfs mount limited to 64 MB is created at /cache, and the data written to it is kept in host memory rather than on disk.
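Docker also offers a --tmpfs shorthand for the same kind of mount. The sketch below assumes a Linux host; run_with_tmpfs is a hypothetical helper name, and the option string uses standard tmpfs mount options (size and mode).

```shell
# Hypothetical helper: run a container with an in-memory scratch directory,
# using the --tmpfs shorthand instead of --mount type=tmpfs.
run_with_tmpfs() {
  target="$1"   # mount path inside the container, e.g. /cache
  size="$2"     # size limit, e.g. 64m
  image="$3"    # image to run (placeholder name in this article)
  docker run -d \
    --tmpfs "${target}:rw,size=${size},mode=1777" \
    "$image"
}
```

For example, run_with_tmpfs /cache 64m my-container should give the container a scratch directory comparable to the --mount example above.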
Advanced Topics in Docker Volumes
1. Sharing Volumes Between Multiple Containers
In a typical microservices architecture, multiple containers may need to share access to the same data. Docker volumes provide an easy way to share files and directories between containers.
For example, let’s say you have two containers, one for generating data and one for consuming it. You can use a shared volume between them:
docker run -d --name producer -v sharedvolume:/data producer-container
docker run -d --name consumer -v sharedvolume:/data consumer-container
In this case, both the producer and consumer containers can access the data stored in /data, allowing them to share state without needing to pass data over the network.
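If only one side should write, the other container can mount the volume read-only by appending :ro. A minimal sketch, assuming a hypothetical start_shared_pair helper and the placeholder image names from the example above:

```shell
# Hypothetical helper: start a writer and a read-only reader on one volume.
start_shared_pair() {
  volume="$1"   # shared volume name, e.g. sharedvolume
  # Create the volume explicitly so both containers refer to the same one.
  docker volume create "$volume" >/dev/null || return 1
  # The producer gets normal read-write access.
  docker run -d --name producer -v "${volume}:/data" producer-container || return 1
  # The consumer mounts the same volume read-only via the :ro suffix.
  docker run -d --name consumer -v "${volume}:/data:ro" consumer-container
}
```

This keeps the consumer from modifying shared files by accident. Docker does not coordinate concurrent writes, so applications that both write to a shared volume need their own locking.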
2. Volume Drivers
Volume drivers allow you to store data outside of the Docker host, such as in cloud storage solutions or network file systems. This is useful for highly available, distributed applications that require persistent storage across multiple hosts.
Some common volume drivers include:
local: The default driver, which stores volumes on the local filesystem of the Docker host. Through its mount options it can also back a volume with a network share such as NFS.
Third-party plugins: Drivers for network or distributed filesystems (GlusterFS, for example) are not built in and must be installed separately before use.
Using a Volume Driver:
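docker volume create --driver local --opt type=nfs --opt o=addr=host.docker.internal,rw --opt device=:/data nfs-volume
This example defines a Docker volume backed by an NFS share using the built-in local driver. Replace the addr value with the address of your NFS server. Containers on different hosts can access the same data if each host defines a volume pointing at the same share.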
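The same idea can be sketched as a reusable function, assuming Docker's built-in local driver, which can mount NFS exports through its options. The create_nfs_volume name and its parameters are hypothetical.

```shell
# Hypothetical helper: define a volume backed by an NFS export.
create_nfs_volume() {
  name="$1"          # volume name, e.g. nfs-volume
  server="$2"        # NFS server address (assumed reachable from the Docker host)
  export_path="$3"   # exported directory on the server, e.g. /data
  docker volume create \
    --driver local \
    --opt type=nfs \
    --opt "o=addr=${server},rw" \
    --opt "device=:${export_path}" \
    "$name"
}
```

Docker mounts the share only when a container starts using the volume, so a wrong address or export path typically shows up at that point rather than when the volume is created.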
Ensuring Data Persistence Across Container Restarts
Ensuring data persistence across container restarts or even host failures is essential in production environments, especially when running databases or any service that writes to disk.
Let’s look at an example of how to persist data for a MySQL container:
docker run -d \
--name mysql \
-e MYSQL_ROOT_PASSWORD=root \
-v mysql_data:/var/lib/mysql \
mysql:latest
In this example, the mysql_data volume is mounted at /var/lib/mysql, the directory where MySQL stores its data. Even if the container is stopped and removed, the volume will retain the database files, ensuring that your data is not lost.
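One way to see this in practice is to remove the container and start a new one on the same volume. A minimal sketch, with recreate_mysql as a hypothetical helper name and the same settings as the command above:

```shell
# Hypothetical helper: replace the MySQL container while keeping its data.
recreate_mysql() {
  # Remove the old container; the named volume is left untouched.
  docker rm -f mysql >/dev/null || return 1
  # Start a fresh container on the same volume. The data directory is
  # already initialized, so the existing databases are picked up.
  docker run -d \
    --name mysql \
    -e MYSQL_ROOT_PASSWORD=root \
    -v mysql_data:/var/lib/mysql \
    mysql:latest
}
```

With the official image, MYSQL_ROOT_PASSWORD only takes effect when the data directory is empty, so the new container keeps the credentials already stored in the volume.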
Backing Up and Restoring Docker Volumes
In production, data integrity is critical, so understanding how to back up and restore volumes is an important skill.
Backing Up a Volume:
To back up a Docker volume, you can use the docker run command to create a tar archive of the volume contents:
docker run --rm -v mysql_data:/data -v "$(pwd)":/backup busybox tar cvf /backup/mysql_data_backup.tar /data
This command mounts the mysql_data volume to /data inside a busybox container, then creates a tarball of the volume's contents and stores it in the current directory on the host. For a database, stop the container that uses the volume first, since copying the files of a running database can produce an inconsistent backup.
Restoring a Volume:
To restore a backup, you can use a similar command to extract the tar archive back into the volume:
docker run --rm -v mysql_data:/data -v "$(pwd)":/backup busybox tar xvf /backup/mysql_data_backup.tar -C /
This command extracts the backup tarball into the mysql_data volume. The archive stores its files under data/ (tar strips the leading slash), so extracting relative to / puts them back in /data rather than in a nested /data/data directory.
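For regular backups, a dated file name avoids overwriting earlier archives. The sketch below is one possible wrapper around the backup command; backup_volume is a hypothetical helper name, and the output directory is assumed to exist already.

```shell
# Hypothetical helper: archive a volume into a dated tar file.
backup_volume() {
  volume="$1"    # volume to back up, e.g. mysql_data
  out_dir="$2"   # existing host directory that receives the archive
  stamp="$(date +%Y%m%d-%H%M%S)"
  archive="${volume}_${stamp}.tar"
  # Resolve the output directory to an absolute path for the bind mount.
  abs_out="$(cd "$out_dir" && pwd)" || return 1
  # Mount the volume read-only so the backup cannot modify it.
  docker run --rm \
    -v "${volume}:/data:ro" \
    -v "${abs_out}:/backup" \
    busybox tar cf "/backup/${archive}" /data >/dev/null || return 1
  # Print the archive path so callers can record it.
  echo "${abs_out}/${archive}"
}
```

For example, backup_volume mysql_data . writes an archive into the current directory and prints its path. Because tar stores the files under data/ inside the archive, extract it relative to / when restoring.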
Cleaning Up Unused Volumes
Docker volumes can accumulate over time, especially when creating and destroying containers during development. To free up disk space, it’s important to periodically clean up unused volumes.
Removing Unused Volumes:
docker volume prune
This command removes volumes that are not currently being used by any container. On Docker Engine 23.0 and later it only considers anonymous volumes unless you add --all. Be careful with this command, as it permanently deletes the data stored in the volumes it removes.
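Since pruning is irreversible, it can help to list the candidates first. A minimal sketch, assuming a hypothetical prune_with_preview helper built on the dangling filter of docker volume ls:

```shell
# Hypothetical helper: list unused volumes, then hand over to the prune command.
prune_with_preview() {
  # List volumes that no container references.
  unused="$(docker volume ls --quiet --filter dangling=true)" || return 1
  if [ -z "$unused" ]; then
    echo "no unused volumes"
    return 0
  fi
  echo "unused volumes:"
  echo "$unused"
  # docker volume prune asks for confirmation itself unless --force is given.
  docker volume prune
}
```

The list can be longer than what the prune actually deletes: the filter shows both named and anonymous unused volumes, while recent Docker versions prune only anonymous ones unless --all is passed.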
Conclusion
In this blog, we’ve explored the intricacies of persistent storage in Docker, focusing on Docker volumes, bind mounts, and tmpfs mounts. We’ve also covered advanced topics such as sharing volumes between containers, using volume drivers for distributed storage, and backing up/restoring volumes in production environments.
Docker volumes provide a powerful and flexible way to manage persistent data in containerized applications. By understanding the different storage options and best practices for managing volumes, you can build stateful applications that are resilient, scalable, and easy to manage.
In the next blog, we’ll dive into Docker Compose and how it simplifies the orchestration of multi-container applications.
Key Takeaways for Developers:
Use Docker Volumes for data persistence, as they are managed by Docker and can be easily shared across containers.
Bind Mounts give containers direct access to host files, which is convenient in development, but they are less portable because they tie containers to specific host directories.
tmpfs Mounts provide fast, in-memory storage for temporary data.
Always back up critical volumes and clean up unused volumes to maintain storage efficiency.
Written by
Harsh Mange
This is Harsh Mange, working as a Software Engineer - Backend at Argoid. I love building apps. Working on contributing to open-source projects and the dev community.