Kubernetes Architecture 101: Meet the Cluster, Nodes, and the Brain


Introduction
In our last post, we explored why Kubernetes is such a game-changer, tackling the messy challenges of managing containerized applications at scale. We learned it automates deployment, scaling, healing, and networking through a declarative approach – tell it what you want, and it works to make it happen.
But how does it actually do that? What are the pieces that make up this powerful system? It's time to peek under the hood and understand the core architecture of Kubernetes. In this post, we'll meet the main players: the Cluster, the Nodes, and the all-important Control Plane.
The Big Picture: What is a Kubernetes Cluster?
At the highest level, a Kubernetes Cluster is a set of machines, called Nodes, that run your containerized applications. Think of it as a unified computing resource pool made up of individual computers, all working together under Kubernetes' command.
You interact with the Cluster as a single entity, mostly through the Kubernetes API. You tell the cluster what applications to run, how many copies, and what resources they need, and Kubernetes handles distributing and managing that work across the available Nodes.
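To make that declarative flow concrete, here's a minimal sketch (the name hello-web and the nginx image are just illustrative; we'll properly meet the objects involved later in the series):

```bash
# Declare the desired state: "run 3 copies of this container".
# Kubernetes, not you, decides which Nodes actually run them.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web              # hypothetical name
spec:
  replicas: 3                  # how many copies we want
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: web
        image: nginx:1.25      # any container image works here
EOF
```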
A cluster must have at least one Worker Node (where your applications actually run) and at least one Control Plane Node (which manages the cluster).
Nodes: The Worker Bees of the Cluster
A Node is a worker machine in Kubernetes, which could be a physical server in your data center or a virtual machine hosted by a cloud provider (like an EC2 instance in AWS or a Compute Engine VM in GCP). Each Node provides the necessary CPU, Memory, and Network resources for your applications.
There are two main types of Nodes:
Worker Nodes:
Job: Run the actual application containers. This is where the work gets done!
Components: They run specific Kubernetes agents that allow them to be managed by the Control Plane and execute tasks like starting/stopping containers. We'll detail these components below.
A cluster typically has many Worker Nodes.
Control Plane Nodes (historically sometimes called "Master Nodes"):
Job: Manage the state of the cluster. They make global decisions about the cluster (like scheduling applications) and detect and respond to cluster events (like a Node becoming unavailable).
Components: They run the core Kubernetes services that form the "brain" of the cluster.
For high availability and fault tolerance, production clusters usually run multiple Control Plane Nodes (often 3 or 5) to avoid a single point of failure. For learning environments like Minikube, you might just have one machine acting as both the Control Plane and a Worker Node.
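On a real cluster you can see both kinds of Node with a single command; the names, ages, and versions below are illustrative:

```bash
# Ask the cluster what Nodes it has and what role each plays.
kubectl get nodes

# Illustrative output for a one-control-plane, two-worker cluster:
# NAME       STATUS   ROLES           AGE   VERSION
# cp-1       Ready    control-plane   10d   v1.28.2
# worker-1   Ready    <none>          10d   v1.28.2
# worker-2   Ready    <none>          10d   v1.28.2
```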
The Brains of the Operation: The Control Plane Components
The Control Plane is the heart and brain of Kubernetes. It's responsible for maintaining the desired state of the cluster. It's not a single entity but a collection of processes running on one or more Control Plane Nodes. Let's meet the key players:
kube-apiserver (API Server):
Role: The frontend of the Control Plane. It's the gatekeeper and central hub for all communication into and within the cluster.
Function: Exposes the Kubernetes API. Whether you use kubectl (the command-line tool), the dashboard, or any other client, every interaction with Kubernetes goes through the kube-apiserver. It validates and processes API requests and is the sole gateway to the cluster's state (the state itself is stored in etcd).
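You can watch this in action: raising kubectl's log verbosity shows the HTTPS calls it makes to the kube-apiserver (exact output varies by version; the line below is trimmed and illustrative):

```bash
# Every kubectl command is just an authenticated HTTPS request
# to the kube-apiserver. -v=6 logs each request and response code.
kubectl get pods -v=6

# ... GET https://<apiserver>:6443/api/v1/namespaces/default/pods?limit=500 200 OK in 12 milliseconds
```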
etcd:
Role: The memory or database of the cluster.
Function: A consistent and highly-available distributed key-value store. It reliably stores all cluster data – the configuration, current state, desired state, metadata, secrets, everything. The kube-apiserver is the only component that talks directly to etcd; every other component queries the apiserver instead. Losing etcd data means losing the state of your cluster, which is why it's critical to back it up and run it in a highly available configuration.
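For example, you can take an etcd backup with etcdctl. This is a sketch assuming a kubeadm-style cluster, where the certificates live under /etc/kubernetes/pki/etcd; adjust the paths for your setup:

```bash
# Run on a Control Plane node: snapshot the entire cluster state.
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
```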
kube-scheduler:
Role: The matchmaker.
Function: Watches for newly created Pods (the basic running unit in K8s, which we'll cover in the next post!) that don't have a Node assigned yet. Its job is to find the best available Worker Node to run that Pod based on resource requirements (CPU, memory), constraints, policies, affinity/anti-affinity rules, data locality, etc. It makes the decision; it doesn't actually place the Pod (the kubelet does that).
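Resource requests are one of the main signals the scheduler weighs. A hedged sketch (the Pod name and image are hypothetical):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: needs-resources        # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        cpu: "500m"            # scheduler only considers Nodes with this much spare CPU
        memory: "256Mi"
EOF

# The scheduler's decision is recorded as an event on the Pod:
kubectl describe pod needs-resources
# Events (trimmed, illustrative):
#   Normal  Scheduled  ...  Successfully assigned default/needs-resources to worker-1
```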
kube-controller-manager:
Role: Runs various controller processes that regulate the state of the cluster.
Function: Think of controllers as loops that watch the cluster's shared state (via the apiserver) and work to make the current state match the desired state; there's a quick demonstration after this list. For efficiency, it bundles several core controllers into one binary, including:
Node Controller: Responsible for noticing and responding when nodes go down.
Replication Controller / ReplicaSet Controller: Maintains the correct number of Pods for replicated workloads.
Endpoint Controller: Populates the Endpoints object (joins Services & Pods).
Service Account & Token Controllers: Create default accounts and API access tokens for new namespaces.
...and many others.
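Here's the promised demonstration of a control loop, reusing the hypothetical hello-web Deployment from earlier. Delete its Pods, and the ReplicaSet controller notices that the current state (0 Pods) no longer matches the desired state (3 Pods):

```bash
# Kill every Pod belonging to the Deployment...
kubectl delete pod -l app=hello-web --wait=false

# ...and replacements appear within seconds (illustrative output):
kubectl get pods -l app=hello-web
# NAME                         READY   STATUS              RESTARTS   AGE
# hello-web-6d4b9c5f7d-abcde   0/1     ContainerCreating   0          2s
# hello-web-6d4b9c5f7d-fghij   0/1     ContainerCreating   0          2s
# hello-web-6d4b9c5f7d-klmno   0/1     ContainerCreating   0          2s
```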
cloud-controller-manager (Optional):
Role: Integrates with specific cloud provider APIs.
Function: Embeds cloud-specific control logic. This allows Kubernetes to interact with your cloud provider's features (like Load Balancers, Storage Volumes, Node management) without having cloud-specific code baked into the main kube-controller-manager. It runs controllers like:
Node Controller: For checking the cloud provider to see if a node was deleted in the cloud.
Route Controller: For setting up routes in the underlying cloud infrastructure.
Service Controller: For creating, updating, and deleting cloud provider load balancers (demonstrated in the sketch below).
If you're running Kubernetes on-premises, you won't have this component.
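To see why this matters, consider a Service of type LoadBalancer: it's the cloud-controller-manager's service controller that calls your provider's API to provision the actual load balancer. A minimal sketch (names hypothetical):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: hello-web-lb           # hypothetical name
spec:
  type: LoadBalancer           # on a cloud, this triggers the service controller
  selector:
    app: hello-web
  ports:
  - port: 80
    targetPort: 80
EOF

# On a cloud provider, EXTERNAL-IP fills in once the load balancer
# exists; on-prem (no cloud-controller-manager) it stays <pending>.
kubectl get svc hello-web-lb
```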
The Brawn: Worker Node Components
Worker Nodes need agents running on them to communicate with the Control Plane and manage the containers assigned to them. The key components are:
kubelet:
Role: The primary agent running on each Worker Node (and typically on Control Plane Nodes as well).
Function: It registers the Node with the apiserver and watches for Pods that have been scheduled to its Node by the kube-scheduler. It then works with the Container Runtime to start the Pod's containers, mounts necessary volumes, manages container health (via probes), and reports the status of the Node and its Pods back to the apiserver. It's the kubelet that actually executes the tasks assigned by the Control Plane.
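For instance, the health probes you declare in a Pod spec are executed by the kubelet on that Pod's Node, not by the Control Plane. A minimal, hypothetical example:

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: probed-app             # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25
    livenessProbe:             # the kubelet runs this check locally
      httpGet:
        path: /
        port: 80
      periodSeconds: 10        # probe every 10s; restart the container on repeated failure
EOF
```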
kube-proxy:
Role: The network plumber on each Node.
Function: Maintains network rules on the Node to allow network communication to your Pods from inside or outside the cluster. It understands Kubernetes Services (which we'll cover soon) and ensures traffic destined for a Service's IP address gets routed correctly to one of the backing Pods. It usually does this using technologies like iptables or IPVS.
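If your cluster runs kube-proxy in iptables mode, you can see these rules directly on a Node (chain names like KUBE-SERVICES are standard, but the exact output depends on your cluster):

```bash
# Service routing rules live in the KUBE-SERVICES chain of the nat table.
sudo iptables -t nat -L KUBE-SERVICES -n | head
# Trimmed, illustrative output:
# Chain KUBE-SERVICES (2 references)
# target             prot  opt  source     destination
# KUBE-SVC-XXXXXXXX  tcp   --   0.0.0.0/0  10.96.0.1    /* default/kubernetes:https cluster IP */ tcp dpt:443
```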
Container Runtime:
Role: The software responsible for running containers.
Function: Kubernetes is flexible and supports several runtimes that adhere to the Kubernetes Container Runtime Interface (CRI). Examples include:
Docker Engine (supported through the dockershim compatibility layer, which was deprecated in v1.20 and removed in Kubernetes v1.24; images built with Docker still run unchanged on the other runtimes)
containerd (a popular, robust runtime, now the default in many distributions)
CRI-O (another lightweight, OCI-compliant runtime built specifically for Kubernetes)
The kubelet interacts with the chosen Container Runtime to pull images, start, stop, and manage the lifecycle of the containers for assigned Pods.
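You can check which runtime your Nodes use without logging into them, and on a Node itself, crictl speaks CRI to whichever runtime is installed (output below is illustrative):

```bash
# The CONTAINER-RUNTIME column reveals each Node's runtime:
kubectl get nodes -o wide
# NAME       ...   CONTAINER-RUNTIME
# worker-1   ...   containerd://1.7.2

# On the Node itself, crictl works against any CRI-compliant runtime:
sudo crictl ps                 # list running containers
```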
Diagram: High-Level Kubernetes Architecture