Docker Day 4: Here's How You Can Reduce the Size of Your Docker Image by 99.7%

Rachana

What is a Docker Multi-Stage Build?

Imagine you’re a chef making a delicious pizza. You want to prepare the dough, sauce, and toppings separately and combine them at the end. Similarly, Docker multi-stage builds allow you to create different stages in a Dockerfile, each focusing on a specific part of your application, and then combine them into a final image. This approach helps you create smaller, more efficient images.

With multi-stage builds, you can separate the build environment from the runtime environment, enabling you to compile and package your application in one stage and then copy the resulting artifacts into a lightweight runtime image in another stage.

  • Stage 1 (build stage) installs dependencies, copies source code, and builds the application.

  • Stage 2 (runtime stage) uses a lightweight image for the runtime environment, scratch in this demo. It copies the built artifacts from the build stage and includes only what is needed in production.

  • Finally, it specifies the command to run the application.

https://github.com/RachanaVenkat/docker-multi-stage-build - fork this repository to perform this demo.

Understanding the Dockerfile

In this demo, we are going to build the Docker image of a simple calculator application written in Go (calculator.go), which sits in the same directory as the Dockerfile.

The Dockerfile is divided into two stages: one for the build environment and the other for the runtime environment.

Stage 1

  1. FROM ubuntu AS build: This starts the build stage using the ubuntu image as the base. The AS build part names this stage "build," which we will use later.

  2. RUN apt-get update && apt-get install -y golang-go: This updates the package list and installs the Go programming language (golang-go).

  3. ENV GO111MODULE=off: This sets the environment variable GO111MODULE to off, which disables Go module support so that the build runs in the older GOPATH mode. This is appropriate if you’re building without modules (no go.mod file).

  4. COPY . .: This copies all the files from the build context (the local directory that contains the Dockerfile) into the Docker image’s working directory.

  5. RUN CGO_ENABLED=0 go build -o /app .: This command compiles the Go application. The CGO_ENABLED=0 part disables CGO (C bindings) to create a statically linked binary, which matters because the scratch image used in Stage 2 contains no shared libraries. The -o /app flag specifies that the output binary should be named app and placed in the root directory (/). The . at the end specifies the current directory as the source code location.

Stage 2

  1. FROM scratch: This starts the runtime stage using scratch as the base image. scratch is an empty image and represents the starting point for building images. It contains nothing by default, which reduces the attack surface. Since a statically linked Go binary needs no separate runtime or shared libraries, a scratch image is the most efficient choice.

  2. COPY --from=build /app /app: This copies the compiled binary (/app) from the build stage into the root directory of the new image. The --from=build part specifies that the source of the copy operation is the build stage.

  3. ENTRYPOINT ["/app"]: This sets the entrypoint for the container to /app, which means when the container starts, it will execute the /app binary.
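Putting the instructions from both stages together, the Dockerfile can be reconstructed as follows. This is a sketch assembled from the steps listed in this post, not a copy of the file in the repository, and the directory name multistage-sketch is a placeholder chosen for illustration. It is written as a shell snippet so the file can be created in one step.

```shell
# Create a directory for the sketch (the directory name is hypothetical).
mkdir -p multistage-sketch

# Write the two-stage Dockerfile as the steps above describe it.
cat > multistage-sketch/Dockerfile <<'EOF'
# Stage 1: build environment with the Go toolchain
FROM ubuntu AS build
RUN apt-get update && apt-get install -y golang-go
ENV GO111MODULE=off
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Stage 2: empty runtime image that holds only the compiled binary
FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]
EOF

# Quick sanity check: a two-stage Dockerfile has two FROM lines.
grep -c '^FROM ' multistage-sketch/Dockerfile
```

To build an image from it, calculator.go would need to sit in the same directory as this Dockerfile, as it does in the repository.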

Summary

  1. Build Stage (Kitchen):

    • We set up a full kitchen with all the tools and ingredients.

    • Prepare the dough and toppings (compile the application).

    • Output a ready-to-bake pizza (compiled binary).

  2. Runtime Stage (Pizza Box):

    • Use a clean, minimal box to keep the pizza (a small, empty base image).

    • Place the prepared pizza into the box (copy the compiled binary).

    • Set instructions for serving the pizza (define the entrypoint).

This multi-stage build ensures our final product (the Docker image) is lightweight, containing only the necessary elements (the compiled binary), just like a pizza delivered in a minimal, clean box without all the kitchen mess.

Practical Demonstration

https://github.com/RachanaVenkat/docker-multi-stage-build - this repository contains 2 folders: one with a Dockerfile that does not use a multi-stage build and one with a Dockerfile that does.

We will build Docker images from both Dockerfiles and compare their sizes.

  1. Create an Ubuntu EC2 instance with Docker and Go installed.

    sudo apt update

    sudo apt install docker.io

    sudo apt install golang-go

  2. Clone my repository:

    git clone https://github.com/RachanaVenkat/docker-multi-stage-build.git

  3. Navigate to the folder dockerfile-without-multistage and build the image:

    docker build -t simplecaclci .

  4. Check the size of the Docker image that was created:

    docker images

    870MB for a simple calculator is hefty, isn't it?

  5. Now, navigate to the folder dockerfile-with-multistage and build the image:

    docker build -t multicalci .

Run docker images again and compare the two sizes: there you see the difference. It is almost a 99.7% reduction in the size of the Docker image, which shows how effective a multi-stage build is.
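To check the percentage yourself, the reduction can be computed from the two image sizes. The sketch below is a small shell helper: the function name reduction_pct is made up for this example, and the 2.61 input is an illustrative value back-calculated from the 99.7% figure, not a measured size. Substitute the numbers that docker images reports on your machine.

```shell
# reduction_pct BEFORE AFTER
# Prints how much smaller AFTER is than BEFORE, as a percentage with one decimal.
# Both arguments must use the same unit (for example MB).
reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (1 - after / before) * 100 }'
}

# 870 is the single-stage size reported above; 2.61 is an illustrative value.
reduction_pct 870 2.61
```

If the units shown by docker images differ between the two images, docker image inspect --format '{{.Size}}' followed by the image name prints the size in bytes, which can be passed to the helper directly.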

Docker multi-stage builds are a powerful technique for creating efficient and lightweight Docker images. By separating the build and runtime environments, you can significantly reduce the size of your final image, as demonstrated in our practical example with the Go-based calculator application.

This approach not only optimizes resource usage but also enhances security by minimizing the attack surface. Distroless images, which ship only an application's runtime dependencies, apply the same idea when an application needs more than an empty scratch image provides. Embracing multi-stage builds in your Docker workflow can lead to more streamlined and maintainable deployments, making it an essential practice for modern containerized application development.

You must be feeling great already!

