Linux History, File System Hierarchy and Evolution of Containerization
Let's revisit Linux history and file system hierarchy!
Linux is a generic name for a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds.
In 1969, Ken Thompson and Dennis Ritchie at Bell Laboratories created the UNIX operating system. They later rewrote it in the C programming language to make it portable across more computers, and UNIX became very popular.
About ten years later, Richard Stallman began the GNU project. He tried to make a new operating system kernel called Hurd, but it was never finished. However, this project led to the GNU General Public License (GPL), an important license for free software.
The kernel is the core part of an operating system. It helps the computer's hardware communicate with its software. It also manages many other important tasks. In simple terms, the kernel controls most of what happens on your computer.
During this time, other systems similar to UNIX were developed, like BSD and MINIX. However, these systems all had one thing in common: they didn't have a standard kernel that everyone used.
Then in 1991, a young man named Linus Torvalds started creating what we now call the Linux kernel. Since then, many distributions have been built around the Linux kernel and are therefore commonly known as Linux operating systems. Some of them are:
Debian-based: Ubuntu, Linux Mint, Kali Linux
RHEL-based (Red Hat Enterprise Linux, commonly referred to as RHEL, developed by Red Hat): Oracle Linux, Rocky Linux
openSUSE, created by the openSUSE Project
Many Linux distributions now use the word "Linux" in their name, but the Free Software Foundation uses and recommends the name "GNU/Linux" to emphasize the use and importance of GNU software in many distributions.
File System Hierarchy
Go ahead and run ls -l / to see the directories listed under the root directory. Yours may look different from mine, but for the most part the directories should look like the following (a short exploration example follows the list):
/ - The root directory of the entire filesystem hierarchy; everything is nested under this directory.
/bin - Essential ready-to-run programs (binaries), includes the most basic commands such as ls and cp.
/boot - Contains kernel boot loader files.
/dev - Device files.
/etc - Core system configuration directory, should hold only configuration files and not any binaries.
/home - Personal directories for users, holds your documents, files, settings, etc.
/lib - Holds library files that binaries can use.
/media - Used as an attachment point for removable media like USB drives.
/mnt - Temporarily mounted filesystems.
/opt - Optional application software packages.
/proc - Information about currently running processes.
/root - The root user's home directory.
/run - Information about the running system since the last boot.
/sbin - Contains essential system binaries, which usually can only be run by root.
/srv - Site-specific data served by the system.
/tmp - Storage for temporary files.
/usr - This directory is unfortunately named; most often it does not contain user files in the sense of a home folder. It is meant for user-installed software and utilities, though that is not to say you can't add personal directories there. Inside it are sub-directories such as /usr/bin and /usr/local.
/var - Variable directory, it's used for system logging, user tracking, caches, etc. Basically anything that is subject to change all the time.
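If you want to poke around the hierarchy yourself, a couple of standard commands are enough; this is a minimal sketch, and the exact output will vary by distribution:

```bash
# List the top-level directories with their permissions and owners
ls -l /

# Read the canonical description of the filesystem hierarchy
man hier

# Show how much space each top-level directory uses
sudo du -sh /* 2>/dev/null
```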
How containerization and Linux work together:
Containerization is a well-established technology whose building blocks were introduced decades ago. It uses specific features of the Linux operating system to create isolated environments called containers. These containers allow multiple separate Linux systems to run on one computer without the heavy resource use of traditional virtual machines.
Two key Linux features make containerization possible:
Chroot: This is a tool that changes how a running program sees the file system. It basically creates a separate area for the program, keeping it isolated from the rest of the system. When you use chroot, it changes the root directory / for a program. This means the program can only access files in the "new" root directory you’ve assigned and not the rest of the system.
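As a rough illustration, here is what a minimal chroot looks like from the shell. This sketch assumes a statically linked busybox binary is available at /bin/busybox; the paths are just for demonstration:

```bash
# Build a tiny "new root" containing only a static busybox
sudo mkdir -p /tmp/newroot/bin
sudo cp /bin/busybox /tmp/newroot/bin/   # assumes busybox is statically linked

# Start a shell whose root directory is /tmp/newroot;
# from inside, the shell cannot see anything outside that directory
sudo chroot /tmp/newroot /bin/busybox sh
```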
Namespaces: These are a core part of Linux that separate different types of resources. They let different processes have their own view of things like process IDs, user IDs, and network settings. This separation is crucial for container technology.
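You can see namespaces in action with the unshare utility from util-linux, which starts a process inside new namespaces; a minimal sketch:

```bash
# Run ps in new PID and mount namespaces: --fork starts the command as a
# child, and --mount-proc remounts /proc so ps only sees the new namespace
sudo unshare --pid --fork --mount-proc sh -c 'ps aux'
```

Inside the new namespace, ps reports only a couple of processes, even though the host is running many more.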
Very quickly, more technologies combined to make this isolated approach a reality. Control groups (cgroups) are a kernel feature that controls and limits resource usage for a process or group of processes. And systemd, an initialization system that sets up user space and manages its processes, uses cgroups to provide greater control over these isolated processes. Both of these technologies, while adding overall control to Linux, provided the framework for keeping environments reliably separated.
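On a systemd-based system you can watch cgroups at work with systemd-run, which places a command in its own transient cgroup with resource limits. A minimal sketch, assuming cgroup v2 property names:

```bash
# Run a CPU-bound command limited to 100 MB of RAM and 20% of one CPU;
# systemd creates and cleans up the cgroup for us
sudo systemd-run --scope -p MemoryMax=100M -p CPUQuota=20% md5sum /dev/zero
```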
Now let's journey to 1979, when the concept of containers first emerged.
**1979: Unix V7** - During the development of Unix V7 in 1979, the chroot system call was introduced, changing the root directory of a process and its children to a new location in the file system and segregating file access for each process. Chroot was added to BSD (Berkeley Software Distribution) in 1982.
2000: FreeBSD Jails - FreeBSD maintains a complete system, delivering a kernel, device drivers, userland utilities, and documentation, as opposed to Linux, which only delivers a kernel and drivers and relies on third parties for the rest. FreeBSD Jails allows administrators to partition a FreeBSD computer system into several independent, smaller systems – called "jails" – with the ability to assign an IP address and configuration to each system.
**2001: Linux VServer** - Linux VServer is a jail mechanism that can partition resources (file systems, network addresses, memory) on a computer system.
**2004: Solaris Containers** - In 2004, the first public beta of Solaris Containers was released. It combined system resource controls with the boundary separation provided by zones, which were able to leverage features like snapshots and cloning from ZFS.
2005: OpenVZ (Open Virtuozzo) - It uses a patched Linux kernel for virtualization, isolation, resource management, and checkpointing. A patch is a set of changes applied to the source code of software, in this case the Linux kernel. It typically includes modifications that fix bugs or vulnerabilities and can be distributed as files that describe the changes needed to update the code base.
2006: Process Containers - Process Containers (launched by Google in 2006) was designed for limiting, accounting for, and isolating the resource usage (CPU, memory, disk I/O, network) of a collection of processes. It was renamed control groups (cgroups) a year later and was eventually merged into the Linux kernel.
2008: LXC - LXC (LinuX Containers) was the first complete implementation of a Linux container manager. It was implemented in 2008 using cgroups and Linux namespaces, and it works on a single Linux kernel without requiring any patches.
2011: Warden - Cloud Foundry started Warden in 2011. It can isolate environments on any operating system, running as a daemon and providing an API for container management. Warden developed a client-server model to manage a collection of containers across multiple hosts, and it includes a service to manage cgroups, namespaces, and the process life cycle.
2013: LMCTFY - Let Me Contain That For You (LMCTFY) kicked off in 2013 as an open-source version of Google’s container stack, providing Linux application containers.
2013: Docker - Docker also used LXC in its initial stages and later replaced that container manager with its own library, libcontainer. But there’s no doubt that Docker separated itself from the pack by offering an entire ecosystem for container management.
**2016: The Importance of Container Security Is Revealed** - With the wide adoption of container-based applications, systems became more complex and risk increased, laying the groundwork for container security. This led to a shift left in security (the practice of integrating security measures earlier in the software development lifecycle (SDLC) rather than addressing them at the end), making it a key part of each stage in container app development, also known as DevSecOps. The goal is to build secure containers from the ground up without reducing time to market.
2017: Container Tools Become Mature - In 2017, many container management tools entered the mainstream. Kubernetes was adopted by the Cloud Native Computing Foundation (CNCF) in 2016, and in 2017 VMware, Azure, AWS, and Docker announced their support for it.
This was also the year of early tooling that helped manage important aspects of container infrastructure. Ceph and REX-Ray set standards for container storage, while Flannel connected millions of containers across data centers.
Adoption of rkt and containerd by CNCF - Docker's donation of the containerd project to the CNCF in 2017 is symbolic of this maturation, as is the CNCF's adoption of the rkt (pronounced "rocket") container runtime around the same time.
Kubernetes Grows Up - Kubernetes began supporting increasingly complex classes of applications, enabling enterprise transition to both hybrid cloud and microservices. At DockerCon in Copenhagen, Docker announced it would support the Kubernetes container orchestrator, and Azure and AWS fell in line with AKS (Azure Kubernetes Service) and EKS, a Kubernetes service to rival the proprietary ECS. Kubernetes was also the first project adopted by the CNCF and commands a growing list of third-party system integration service providers.
2018: The Gold Standard - The massive adoption of Kubernetes pushed cloud vendors such as AWS, Google with GKE (Google Kubernetes Engine), and Azure to offer managed Kubernetes services. Furthermore, leading software vendors such as VMware, Red Hat, and Rancher started offering Kubernetes-based management platforms.
Open source projects such as Kata Containers, gVisor, and Nabla attempt to provide secure container runtimes with lightweight virtual machines that perform the same way containers do but provide stronger workload isolation.
Another innovation in 2018 was Podman, a daemonless, open-source, Linux-native tool designed to manage containers and pods (groups of containers).
2019: A Shifting Landscape - New runtime engines started replacing the Docker runtime engine, most notably containerd, an open source container runtime engine, and CRI-O, a lightweight runtime engine for Kubernetes. Docker Enterprise was acquired and split off, resulting in Docker Swarm being put on a 2-year end-of-life horizon. At the same time, we witnessed the decline in popularity of the rkt container engine, even though it was officially still part of the CNCF stable.
VMware doubled down on its commitment to Kubernetes by first acquiring Heptio and then Pivotal Software (with both PAS and PKS). It aims to provide enterprises with cloud-like capabilities for cloud-native deployments in their on-premises environments.
Platforms like Knative, a Kubernetes-based serverless workload management platform, gained traction among organizations.
In 2019, Kubernetes-based hybrid-cloud solutions were launched, including Google Anthos, AWS Outposts, and Azure Arc. These cloud platforms blur the traditional lines between cloud and on-premises environments, allowing management of on-premises and single-vendor cloud clusters.
2020: Kubernetes Grows Up - In 2020, Kubernetes matured and added several features that provided much-needed support for “day 2” operations.
Dockershim Removal from Kubernetes - Kubernetes removed Dockershim, a shim that let Docker serve as a container runtime for Kubernetes. The removal of Dockershim was not a rejection of Docker, but rather a move towards standardization: Kubernetes wanted to standardize on the Container Runtime Interface (CRI) as the way to interface with all container runtimes, and Dockershim was non-standard, Docker-specific legacy code. Its removal means that developers need to use a CRI-compliant runtime like containerd or CRI-O to run containers on Kubernetes.
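You can check which CRI-compliant runtime your nodes are using with a standard kubectl query; the CONTAINER-RUNTIME column shows values such as containerd://1.6.x or cri-o://1.24.x:

```bash
# The wide output includes each node's container runtime and version
kubectl get nodes -o wide
```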
Ingress API - The Ingress API handles external access to services, exposing HTTP and HTTPS routes. It performs tasks such as load balancing, providing name-based virtual hosts, and SSL/TLS termination.
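A minimal Ingress manifest looks like the sketch below; demo-ingress, demo.example.com, and demo-service are hypothetical names, and an ingress controller must be running in the cluster for the resource to take effect:

```bash
# Route HTTP traffic for one host to a backend Service
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress          # hypothetical name
spec:
  rules:
  - host: demo.example.com    # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-service   # hypothetical Service
            port:
              number: 80
EOF
```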
Kubectl Node Debugging - kubectl node debugging allows users to debug nodes via kubectl, inspecting a running node without restarting it or entering its containers directly. It enables filesystem checks, debug utility execution, and network requests via the host namespace.
This feature, enabled by default from Kubernetes 1.20, aims to eliminate SSH for node debugging and maintenance.
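A minimal sketch of node debugging, where my-node is a hypothetical node name; the node's root filesystem is made available under /host inside the debug pod:

```bash
# Start an interactive debug pod on the node, sharing its host namespaces
kubectl debug node/my-node -it --image=busybox

# Inside the pod, inspect the node's filesystem via /host, e.g.:
#   ls /host/var/log
```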
Kubernetes Topology Manager - The Kubernetes Topology Manager, introduced in version 1.18 as a beta feature, is a kubelet component that reduces latency and enhances performance in critical applications. It serves as a single source of information for various components through an interface called Hint Providers. This enables components to make resource allocation decisions in line with the hardware topology, delivering low latency and optimized performance for critical workloads.
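The policy is set per node in the kubelet configuration; a minimal sketch, with the file path and policy value shown for illustration:

```bash
# Excerpt of a KubeletConfiguration (often /var/lib/kubelet/config.yaml):
#
#   topologyManagerPolicy: single-numa-node
#
# or the equivalent kubelet command-line flag:
#   --topology-manager-policy=single-numa-node
```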
2021: Containers for the Enterprise - Many vendors worked to make Kubernetes more user-friendly and accessible for organizations.
Multicluster Kubernetes management - Projects such as the Cluster API, the Kubernetes Multi-Cluster API, Hypershift, and kcp aimed to help organizations better manage multicluster Kubernetes environments. This trend was driven by the growing adoption of GitOps, cloud and edge computing, and the increasing demand for multiple clusters as organizations expand.
Kubernetes autoscaling evolves - The Kubernetes Event-Driven Autoscaling (KEDA) project matured and received approval from the CNCF as it demonstrated growing end-user adoption. KEDA, installed as a Kubernetes operator, adds or removes cluster resources based on events from external data sources. This development marked the growth and expansion of Kubernetes deployments in the industry.
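A minimal ScaledObject sketch, where demo-deployment is a hypothetical Deployment; the CPU trigger shown here is only one of the many event sources KEDA supports:

```bash
kubectl apply -f - <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: demo-scaledobject    # hypothetical name
spec:
  scaleTargetRef:
    name: demo-deployment    # hypothetical Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: cpu                # KEDA also supports queues, topics, cron, etc.
    metadata:
      type: Utilization
      value: "50"
EOF
```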
MITRE ATT&CK Framework for Containers - The ATT&CK matrix for containers was developed to provide a detailed understanding of the security risks associated with container environments, and of how attacks on these environments can be detected and prevented. It includes a variety of tactics and techniques, such as Initial Access, Execution, Persistence, Privilege Escalation, Defense Evasion, Credential Access, Discovery, Lateral Movement, Collection, Exfiltration, and Command and Control.
eBPF Foundation - Extended Berkeley Packet Filter (eBPF) is a technology that can run sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules. In 2021, the eBPF Foundation was established to steward the technology's development.
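As a taste of what eBPF enables, the bpftrace front end can attach a small sandboxed program to a kernel tracepoint in one line; a minimal sketch that requires root and bpftrace installed:

```bash
# Print the program name and file path for every openat() syscall
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'
```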
2022: Record Adoption of Container Technologies
Record adoption of Kubernetes - According to a 2022 report from the CNCF, 96% of surveyed participants were either using or evaluating Kubernetes, and 79% used managed services like EKS, AKS, or GKE.
Kubernetes becomes widely accessible
Azure Container Apps - Azure Container Apps is a serverless container service offered by Microsoft Azure. It allows developers to deploy and run containerized applications on a fully managed platform. Azure Container Apps supports both Linux and Windows containers and offers features like automatic scaling, integrated CI/CD, and enterprise-grade security.
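Deploying to it is typically a single CLI call; a minimal sketch using the az containerapp commands, where demo-app, demo-rg, and demo-env are hypothetical names and the environment is assumed to exist already:

```bash
# Deploy a public container image with external HTTP ingress
az containerapp create \
  --name demo-app \
  --resource-group demo-rg \
  --environment demo-env \
  --image nginx \
  --target-port 80 \
  --ingress external
```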
Increase in edge usage - Several CNCF open-source projects, including KubeEdge, SuperEdge, and Akri, can facilitate edge Kubernetes deployments.
Increasing use of stateful deployments - While containers are designed to be ephemeral and stateless, most applications still require some form of persistent storage. The community has developed several workarounds to enable stateful deployments in Kubernetes: improved support for persistent volumes (PVs), Kubernetes-native backup architectures, automated data backup plans, recovery in the correct order, and processes that are agnostic to database types.
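Persistent state in Kubernetes usually starts with a PersistentVolumeClaim, which a pod then mounts as a volume; a minimal sketch with a hypothetical claim name:

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc             # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce            # mountable read-write by a single node
  resources:
    requests:
      storage: 1Gi           # request 1 GiB from the default storage class
EOF
```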
A Linux system is divided into three main parts:
Hardware - This includes all the physical components that your system runs on: memory, CPU, disks, etc.
Linux Kernel - It manages the hardware and tells it how to interact with the system.
User Space - This is where users will be directly interacting with the system.
Written by
Ruchi Lamichhane
I am a tech enthusiast embracing the world of continuous integration, automation, and collaboration to make a meaningful impact in the dynamic realm of DevOps.