Docker for Absolute Beginners

Devisri
11 min read

Hi everyone,

Today, we're diving into the concept of containerization, exploring why it's superior to virtualization. We'll cover how Docker operates, its architecture and features, images, essential Docker commands, and registries. Plus, we'll get hands-on by building and containerizing a simple Node.js application.

My aim here is to provide a comprehensive overview and a step-by-step guide to containerizing an application. This is going to be a long one, so without any further ado, let’s get started!


What is containerization?

Before going deep, let’s look at the official definition of containerization itself:

Containerization is the standard packaging of software code with just the operating system (OS) libraries and dependencies required to run the code, creating a single lightweight executable, called a container, that runs consistently on any infrastructure.

To put it simply, Docker is the application that puts this principle into practice.


But why is this even needed?

Virtualization

Back in the day, before containerization became popular, there was the concept of virtualization. Let's take a moment to explore that.

According to VMware, which popularized this concept,

Virtualization is the process of creating a software-based (or virtual) representation of applications, servers, storage, and networks to reduce IT expenses while boosting efficiency and agility.


Let’s say you have Windows running on your machine, and you want to run Linux on it as well, with each OS allocated its own CPU, memory, and disk space.

You can do this in one of two ways:
1. Dual booting: Install Linux alongside Windows and choose which OS to boot at startup. (or)

2. Virtualization: Use a hypervisor (software that abstracts the hardware) like VirtualBox or VMware to run Linux inside a virtual machine (VM) on Windows.
With virtualization, each VM has a full OS, consuming a lot of resources, which can slow things down. This is where Docker comes in as a lightweight alternative, eliminating the need for full VMs by using containers instead.

  • Hardware Infra: Physical machine where everything runs.

  • Hypervisor: Software such as Hyper-V or VMware that acts as an intermediate layer and allows multiple VMs to run on the same physical machine.

  • Virtual machines: Each VM runs a full OS with the required libraries and dependencies, consuming more resources.

Some of the most commonly used virtualization tools are VirtualBox, VMware, and Hyper-V.


Existing Problems of Virtualization:

Virtualization is handy, but it has downsides that make it a poor fit for certain workloads. VMs are heavy and slow to start, consume a lot of resources, and are tricky to manage. They take a while to boot, don't scale well, and duplicate the operating system, which hurts portability. These issues are exactly why we need a lighter option like containers, which skip the full virtual machine setup.

To solve these issues, Docker was introduced, providing a lightweight and efficient alternative to VMs.

Still, VMs are super important for lots of enterprise, security, and legacy use cases. So, when should you use them?

  • If you need full OS-level isolation (like for systems with multiple users).

  • If you're working with GUI-based apps that need a full OS.

  • If you want to run old software that doesn’t play nice with containers.

  • If you need stronger security boundaries than what containers offer.


Installation

Installing Docker is straightforward and varies slightly depending on the OS you’re using. Here’s how you can do it:

On Linux🐧
Docker runs natively on Linux. To install it:

# download and run Docker's convenience install script
curl -fsSL https://get.docker.com | sh
# start the Docker service and enable it at boot
sudo systemctl start docker
sudo systemctl enable docker

# verify the installation
sudo docker version

A typical output showing Docker version 19.03.1 might appear as follows:

vagrant@ubuntu-bionic:~$ sudo docker version
Client: Docker Engine - Community
 Version:           19.03.1
 API version:       1.40
 Go version:        go1.12.5
 Git commit:        74b1e89
 Built:             Thu Jul 25 21:21:05 2019
 OS/Arch:           linux/amd64
 Experimental:      false

That’s it! You’re ready to run containers. Let’s run the hello-world image from Docker itself:

docker run hello-world
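
Quick tip (a standard, optional post-install step on Linux): to run docker without prefixing sudo every time, add your user to the docker group, then log out and back in:

sudo usermod -aG docker $USER
# after logging back in, this should work without sudo:
docker run hello-world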

On Windows & MacOS
Step 1: Download Docker Desktop from Docker’s official website
Step 2: Install and run the application.
Step 3: Ensure WSL backend (for Windows) or macOS virtualization is enabled.

And here comes the twist!

We usually assume that when Docker is installed on Windows or macOS, it runs native Windows or macOS containers. But guess what? That's not true at all!
Behind the scenes, Docker runs Linux containers. Even on Windows or macOS, it does this by creating a lightweight Linux VM, via Hyper-V (on older Windows versions), WSL 2 (on newer ones), or macOS's virtualization layer. Make sense?


Architecture

Docker introduced OS-level virtualization, where containers share the host kernel but operate in isolated user spaces. This approach allows the host and the Docker engine to allocate hardware resources to containers dynamically. And, hence none of the containers permanently hold the compute resources of the host system.
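
You can verify this kernel sharing yourself. As a quick check on a Linux host with Docker installed, both commands below should print the same kernel release, because the container uses the host's kernel (alpine here is just a convenient, tiny image):

# kernel release on the host
uname -r
# kernel release inside a container: same kernel, isolated user space
docker run --rm alpine uname -r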

Here’s a glimpse of containerization vs. virtualization architecture: in virtualization, each VM carries a full guest OS on top of the hypervisor, while in Docker, containers share the host kernel and package only the app and its dependencies.

When you install Docker on a Linux-based system, the containers it runs use the standard Linux directory layout below, which plays an important role in the Docker ecosystem.

/bin: binary executables
/sbin: system binary executable files
/etc: config files
/lib: library files used by /bin
/usr: user-related files and utilities
/var: variable data, log files, spool files
/root: root user's home directory
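
You can see this layout for yourself by listing the root directory of a minimal container (again using alpine as a convenient small image):

docker run --rm alpine ls /
# bin  dev  etc  home  lib  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var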

Layer-based architecture

Docker images are built in layers, so every change (like installing software, adding files, or modifying configurations) creates a new layer on top of the previous one.

It follows the Copy-on-Write (CoW) principle, meaning Docker doesn't duplicate layers unnecessarily—it only stores the differences.

Each layer gets an ID calculated using a SHA-256 hash of the layer's contents. The Image ID shown by Docker commands (like docker images) is the first 12 characters of this hash.
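
To see the layers of an image in practice, docker history lists each layer along with the instruction that created it and its size. For example, with the node:18-alpine image we'll use later in this post:

docker history node:18-alpine
# one row per layer: IMAGE shows the truncated layer ID (or <missing> for
# layers pulled from a registry), CREATED BY the instruction, SIZE the delta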


Some Terminologies!

  • Docker daemon: The background service (dockerd) that manages objects like containers, images, and networks.

  • Docker Images: Read-only templates with app code, dependencies, and configs used to create containers. Built using layers.

  • Docker engine: The core component of Docker that enables containerization, consisting of the Daemon, CLI, and APIs.

  • Docker Registries: Storage and distribution hubs for Docker images (e.g., Docker Hub, AWS ECR, GitHub Container Registry).

    Example: Instead of installing MongoDB manually (which takes up space and setup time), you can simply pull the lightweight official MongoDB image. This way, MongoDB runs instantly in a container without cluttering your system:

docker pull mongo
docker run -d -p 27017:27017 --name my-mongo mongo

  • Docker CLI: Sends commands to the Docker daemon via its REST API, which processes them to build, run, and manage containers. The daemon handles the underlying container operations, while the CLI provides the interface for users to interact with Docker.
  • Runtime: Responsible for running containers:

    • containerd: Manages container lifecycle (pulling images, creating/running containers).

    • runc: A low-level runtime that actually starts and runs containers, following OCI standards. runc is a separate project maintained by the Open Container Initiative (OCI); other runtimes in this space include CRI-O, LXC, etc.
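
You can check which runtime your installation is using with docker info (output varies by version; the comments below show roughly what a recent default install reports):

docker info | grep -i runtime
#  Runtimes: io.containerd.runc.v2 runc
#  Default Runtime: runc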


But, how does it work internally?

Imagine you have a basic Node.js "Hello World" application, and now you need to containerize it. Let's assume your Node.js project is organized as follows.

/node-app
   ├── server.js
   ├── package.json
   ├── package-lock.json
   ├── Dockerfile

The code for the app (server.js) is written as follows:

const express = require("express");
const app = express();

app.get("/", (req, res) => {
    res.send("Hello world, Docker!");
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log(`Server running on port ${PORT}`));
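
For completeness, the package.json in the project tree above needs express listed as a dependency. A minimal sketch (the version number here is illustrative) could look like this:

{
  "name": "node-app",
  "version": "1.0.0",
  "main": "server.js",
  "dependencies": {
    "express": "^4.18.0"
  }
}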

For that, you need to write a Dockerfile with some instructions. A Dockerfile is a script that tells Docker how to build the image for our application. Here’s a sample Dockerfile for our hello-world app.

FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]

Here’s a simple explanation of the Dockerfile. By convention, the instructions are written in uppercase, and the file is saved with the name Dockerfile.

FROM node:18-alpine: Uses a lightweight Node.js base image.
WORKDIR /app: Sets /app as the working directory inside the container.
COPY package*.json ./: Copies the package files first to install the required dependencies and libraries.
RUN npm install: Installs dependencies inside the container.
COPY . .: Copies the entire project directory.
EXPOSE 3000: Tells Docker the app runs on port 3000.
CMD ["node", "server.js"]: Specifies the default command to start the app.
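
One practical note: COPY . . copies everything in the build context, including a local node_modules folder if you've already run npm install on your machine. A small .dockerignore file (an optional extra, not part of the original project tree above) keeps the build context and image lean:

node_modules
npm-debug.log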

Once you write a Dockerfile, Docker can build an image of the application that is easy to share, portable across machines, and platform independent. This solves the “It works on my machine, why not yours?” problem, the million-dollar problem faced by developers. If you are familiar with OOP, you know the terms “class” and “object.” Similarly, an image is like a class (a template) from which you can run multiple instances (objects), which are called containers.

To build an image from the above Dockerfile, run the command:

docker build -t hello-world:v1 .

docker: Invokes the Docker CLI.
build: Builds an image out of this Dockerfile.
-t: Stands for tag, which lets you name the image. In this case, it is “hello-world”. (Note that image names must be lowercase.)
v1: The version of your application that you’re specifying explicitly. If you do not specify it, the default is latest.
dot (.): Indicates that the Dockerfile is in the current directory. If not, provide its absolute path.

The output should look like this,

root@Devisri:/app# docker build -t hello-world:v1 .
Sending build context to Docker daemon  5.12kB
Step 1/7 : FROM node:18-alpine
 ---> a1b2c3d4e5f6
Step 2/7 : WORKDIR /app
 ---> Running in 123456789abc
Removing intermediate container 123456789abc
 ---> def789ghi012
Step 3/7 .. Step 4/7 .. Step 5/7 .. Step 6/7 ..
Step 7/7 : CMD ["node", "server.js"]
 ---> Running in 789ghi234jkl
Removing intermediate container 789ghi234jkl
 ---> 987xyz654mno
Successfully built 987xyz654mno
Successfully tagged hello-world:v1

As we can see, the image is built in layers, one on top of another; each layer has its own SHA-256 hash, which makes layers reusable and builds efficient, and each instruction runs one after another. You can cross-check whether the image has been built using the docker images command and its Image ID.

root@Devisri:/app# docker images
REPOSITORY       TAG       IMAGE ID       CREATED         SIZE
hello-world      v1        987xyz654mno   5 minutes ago   50MB

Next, start a container from that image:

docker run -d -p 3000:3000 --name my-container hello-world:v1

The above command runs the container in detached mode (-d) in the background, maps (-p) port 3000 of the container to port 3000 on the host, assigns the name (--name) my-container, and uses the image we built, hello-world:v1.

Check if the container is running:

docker ps
CONTAINER ID  IMAGE           COMMAND       CREATED       STATUS       PORTS                    NAMES
abc12345xyz   hello-world:v1  "node server.js"  2 minutes ago   Up 2 minutes  0.0.0.0:3000->3000/tcp   my-container

Now open your browser and go to http://localhost:3000. You should see:

Hello world, Docker!
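
You can also test it straight from the terminal:

curl http://localhost:3000
# Hello world, Docker!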

Congrats, dude! Your Node.js app is running successfully as a Docker container. The next step is to push the image to Docker Hub.

docker login

Log in with your username and password. Then tag the image with your Docker Hub username and push it. After that, verify on Docker Hub and you should see your image there! Good job!

docker tag hello-world:v1 <username>/hello-world:v1
docker push <username>/hello-world:v1

Now you, or anyone else, can pull and run the container anywhere using:

docker pull <username>/hello-world:v1
docker run -d -p 3000:3000 <username>/hello-world:v1

Note that there are two types of registries for pushing Docker images.
1. Public Registries: Public registries, such as Docker Hub and GitHub Container Registry, allow anyone to access and pull images. They are ideal for open-source projects, software sharing, and collaboration.
2. Private Registries: Private registries, such as AWS ECR (Elastic Container Registry), GitHub Packages, and self-hosted Harbor, restrict access to authorized users, providing enhanced security and control over images. They are commonly used in enterprises for managing sensitive applications, custom-built software, and internal deployments.
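
As an illustration, pushing to a private registry works the same way; you just include the registry host in the image name (registry.example.com and myteam below are hypothetical placeholders):

docker login registry.example.com
docker tag hello-world:v1 registry.example.com/myteam/hello-world:v1
docker push registry.example.com/myteam/hello-world:v1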

Now you can stop or delete the containers and images if you want to. Be careful when deleting images, because it can’t be undone!

root@Devisri:~# docker ps
#output
CONTAINER ID   IMAGE            COMMAND            CREATED         STATUS         PORTS                    NAMES
abc12345xyz    hello-world:v1   "node server.js"   2 minutes ago   Up 2 minutes   0.0.0.0:3000->3000/tcp   my-container
docker stop abc12345xyz
#output
abc12345xyz
docker rm abc12345xyz
#output
abc12345xyz
root@Devisri:~# docker ps -a
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
#delete the image itself
root@Devisri:~# docker images
REPOSITORY       TAG       IMAGE ID       CREATED          SIZE
hello-world      v1        987xyz654mno   10 minutes ago   50MB
root@Devisri:~# docker rmi hello-world:v1
Untagged: hello-world:v1
Deleted: sha256:987xyz654mno...
Deleted: sha256:xyz987654lmn...

Now both the image and its containers have been deleted successfully! That's it! If you've read this far, you've done a great job🙌!
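
One more cleanup trick worth knowing: docker system prune removes all stopped containers, dangling images, and unused networks in one go, so use it with care:

docker system prune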


Putting it all together!

The Docker lifecycle runs as a chain from the client down to the low-level runtime. Here's a brief breakdown of the architecture and how container management works under the hood:

  1. Docker Client

    • The user interacts with Docker through the Docker client by running commands like docker run or docker build.
  2. containerd

    • A high-level container runtime that manages the entire lifecycle of containers.

    • It does not create containers directly but uses runc to handle execution.

  3. Shim and runc

    • runc: A lightweight container runtime that actually creates and starts the containers.

    • shim: Maintains the lifecycle of the container after runc exits. This avoids keeping runc running and allows the daemon to restart without stopping the container.
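
On a Linux host with a container running, you can actually spot the shim in the process list (the exact process name varies by Docker version; containerd-shim-runc-v2 is common on recent installs):

ps aux | grep containerd-shim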


Conclusion

And that’s a wrap! 🚀 We’ve journeyed through the world of containerization, explored why Docker is a game-changer over traditional virtualization, and even containerized a Node.js app step by step. With its lightweight, scalable, and efficient architecture, Docker has revolutionized how we build, ship, and run applications.

Now that you have a solid understanding, it’s time to get hands-on and experiment with Docker yourself. In the next blog, we’ll dive deeper into concepts like networking and volumes, and some advanced topics like Docker Swarm!

Thanks for reading!

Happy containerizing! 🐳✨

Let’s connect!
LinkedIn
Github

Written by Devisri

Hello, I'm Devisri, currently a sophomore working towards my undergraduate degree. I have a strong interest in technology and am dedicated to sharing my learning journey and knowledge with the community. I am currently exploring frontend web development and data science, and I'm actively honing my skills in data structures and algorithms using Java. Excited about open-source technologies and cloud-native communities!