Learning Kubernetes: Week 1 - Core Concepts & Cluster Architecture


To be honest, this is not only what I learned this week; it combines what I started a few months ago with what I learned in the last two days.
Core Concepts
Cluster Architecture
The purpose of K8s is to deploy your application in the form of containers, in an automated fashion, so that you can easily run as many instances of your application as required
and easily enable communication between the different parts of your application
Worker Node: hosts applications as containers
Master Node: Manage, plan, schedule, monitor nodes
uses the ETCD cluster to store information about the cluster, such as which applications are deployed, when they were deployed, and on which nodes, all stored as key-value pairs
A kube-scheduler → identifies the right node to place a container on based on the container’s resource requirement, the worker node’s capacity, and any other configurations
Controllers → take care of different areas
Node Controller → responsible for onboarding new nodes to the cluster
and for handling situations where nodes become unavailable or get destroyed
Replication Controller → makes sure that the desired number of containers are running at all times in a replication group
Kube API-server → responsible for orchestrating all operations within the cluster
- It exposes the K8s API that external users use to perform management operations on the cluster, and that the various controllers use to monitor the state of the cluster and make necessary changes as required
As everything is run as a container, we need a Container Engine to run those containers
DOCKER → Container Engine installed on all the nodes (worker and master)
It doesn’t always have to be Docker; k8s supports other container runtimes as well, like containerd or rkt (Rocket)
kubelet (captain)
It is an agent that runs on each node in a cluster.
It listens for instructions from the kube-apiserver and deploys or destroys containers on the node as required
kube-api-server periodically fetches data from the kubelet to monitor the status of nodes with containers on them
Kube-proxy
This service ensures that the necessary rules are in place on the worker nodes so that the containers running on them can reach each other
Docker vs Containerd
In the beginning, K8s was built to orchestrate Docker specifically
As K8s grew in popularity, users wanted to be able to use K8s with other container engines like RKT(Rocket)
So Kubernetes introduced the CRI (Container Runtime Interface)
CRI allows any vendor to plug in as a container runtime, as long as they adhere to the OCI standards
OPEN CONTAINER INITIATIVE (OCI)
imagespec → specifications on how an image should be built
runtimespec → standards on how any container runtime should be developed
But Docker wasn’t built to support the CRI standard, as it existed before CRI was introduced, and since it was the dominant container runtime at the time, Kubernetes had to keep supporting it
K8s came up with dockershim → a hacky, temporary way to continue supporting Docker outside the CRI
Docker is made up of many components; one of them is containerd, the daemon that actually runs the containers → containerd is CRI-compatible and can be used as a runtime on its own, separate from Docker
In v1.24, Kubernetes removed dockershim completely.
If you don’t require Docker’s other features, you can use containerd directly (a graduated CNCF project)
Containerd
It has its own CLI called ctr
Not very user-friendly
It only supports a limited set of features
For anything else, you have to make API calls directly, which is not very convenient
ctr
ctr images pull <image-name>
ctr run <image-name>
NerdCTL
A better alternative is nerdctl
Provide a Docker-like CLI for ContainerD
nerdctl supports Docker Compose
nerdctl supports the newest features in containerd
Encrypted container images
Lazy pulling
P2P image distribution
Image signing and verifying
Namespaces in Kubernetes
nerdctl
nerdctl run --name redis redis:alpine
nerdctl run --name webserver -p 80:80 -d nginx
CRICTL
crictl provides a CLI for CRI-compatible container runtimes
Installed separately
Used to inspect and debug container runtimes
- Ideally not used to create containers
Works across different runtimes
crictl
crictl pull <image-name>
crictl images
crictl ps -a
crictl exec -i -t <container-id> ls
crictl logs <container-id>
crictl pods # list pods
In v1.24
- The dockershim.sock was replaced by the containerd socket
unix:///run/containerd/containerd.sock
unix:///run/crio/crio.sock
unix:///var/run/cri-dockerd.sock
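crictl needs to be pointed at the runtime's socket; a minimal sketch using the containerd endpoint listed above (you can also persist this in /etc/crictl.yaml):
crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a
# or, in /etc/crictl.yaml:
# runtime-endpoint: unix:///run/containerd/containerd.sock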
ETCD
It is a distributed, reliable key-value store that is Simple, Secure, & Fast
key-value store
stores information in the form of keys and values, like a small document or file for each key
You can add additional information to one of those documents without having to change all the others
The default client that comes with etcd is the etcdctl client
./etcdctl set key1 value1 # creates entry in the DB
./etcdctl get key1 # get value of key
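Note that set and get above are etcd v2-style commands; with the v3 API (the default in etcdctl 3.4 and later), the equivalent operations are:
./etcdctl put key1 value1
./etcdctl get key1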
It is a leader-based distributed system. Ensure that the leader periodically sends heartbeats on time to all followers to keep the cluster stable
You should run etcd as a cluster with an odd number of members. Any resource starvation can lead to a heartbeat timeout, causing instability of the cluster. An unstable etcd indicates that no leader is elected. Under such circumstances, a cluster cannot make any changes to its current state, which implies that no new pods can be scheduled.
etcdctl and etcdutl
→ Both are command-line tools for interacting with etcd clusters, but they serve different purposes
etcdctl → the primary CLI client for interacting with etcd over a network
- used for day-to-day operations → managing keys and values, administering the cluster, checking health, and more
etcdutl → an administration utility designed to operate directly on etcd data files, including:
- migrating data between etcd versions
- defragmenting the database
- restoring snapshots
- validating data consistency
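A hedged sketch of etcdutl usage (available from etcd 3.5 onwards; paths are illustrative):
etcdutl defrag --data-dir /var/lib/etcd                                  # defragment the database files
etcdutl snapshot restore snapshot.db --data-dir /var/lib/etcd-restored   # restore a snapshot into a new data dir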
Commands
backup → backup an etcd directory
cluster-health → check the health of the etcd cluster
mk → make a new key with a given value
mkdir → make a new directory
rm → remove a key or a directory
rmdir → remove an empty directory
get → retrieve the value of key
ls → retrieve a directory
set → set the value of a key
setdir → create a new directory or update an existing directory's TTL
update → update an existing key with a given value
updatedir → update an existing directory
watch → watch a key for changes
exec-watch → watch a key for changes and exec an executable
member → member add, remove, and list subcommands
user → add, grant, and revoke subcommands
role → role add, grant, and revoke subcommands
| KIND | Version |
| --- | --- |
| POD | v1 |
| Service | v1 |
| ReplicaSet | apps/v1 |
| Deployment | apps/v1 |
Kube Controller Manager
A controller is like an office or department on the master ship, with its own set of responsibilities, that takes important action whenever a “ship” enters, leaves, changes, or is destroyed
These offices are:
- on a continuous lookout for the status of the ships
- taking necessary actions to remediate the situation
In K8s terms, a controller is a process that continuously monitors the state of various components within the system and works towards bringing the whole system to the desired state
The Node Controller
The Node Controller is responsible for monitoring the status of nodes & taking the necessary action to keep the application running → It does that via kube-apiserver
The node controller checks the status of the nodes every 5 seconds so that it can monitor their health
If it stops receiving heartbeats from a node, it waits for 40 seconds before marking the node as UNREACHABLE
After the node is marked UNREACHABLE, it gives it 5 minutes to come back up. After that, it removes the pods assigned to that node and provisions them on another node, if they are part of a ReplicaSet
The Replication Controller
It monitors the status of replica sets and ensures that the desired number of pods are available in all sets
If a pod dies, it creates another one
How do you see these controllers, and where are they located in your cluster?
They are all packaged into a single process known as Kube-Controller-Manager
When you install the Kube-controller-manager, the different controllers get installed as well
Download the kube-controller-manager binary from the Kubernetes release page, install it, and run it as a service. When you run it, you will see a list of options you can configure, including the ones we discussed (a hedged example of these flags follows the list below):
node-monitor-period
node-monitor-grace-period
pod-eviction-timeout
There is a specific option ‘controllers’ to set which controller to enable
By default, all of them are enabled
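As a hedged illustration, the options above map to kube-controller-manager flags like these (values shown are the defaults discussed earlier):
kube-controller-manager \
  --node-monitor-period=5s \
  --node-monitor-grace-period=40s \
  --pod-eviction-timeout=5m0s \
  --controllers='*'   # enable all controllers (the default)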
How do you view your kube-controller-manager server options?
Depends on how you have set it up. If you set it up via the kubeadm tool, kubeadm deploys the kube-controller-manager as a pod in the kube-system namespace on the master node
You can see the options within the pod definition file created at
/etc/kubernetes/manifests/kube-controller-manager.yaml
In a non-kubeadm setup, you can inspect the options at the following path:
/etc/systemd/system/kube-controller-manager.service
You can also see the running process and its effective options by searching for the process on the master node
ps -aux | grep kube-controller-manager
Kube Scheduler
Responsible for scheduling pods on nodes (only deciding which pod goes on which node); it doesn’t actually place them there, that is the job of the kubelet
Kubelet creates pod on the ship
Why need a Scheduler?
Because there are many pods, you want to make sure that the right container goes on the right ship
In K8s , the scheduler decides which node the pods are placed on, depending on certain criteria.
You may have a pod with different resource requirements
You can have nodes in a cluster dedicated to certain applications
Scheduler looks at each pod and tries to find the best node for it
It has a set of memory and CPU requirements
Scheduler goes through two phases to identify the best node for the pod
Filter Nodes
Rank Nodes
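The resource numbers the scheduler filters and ranks on come from the pod's resource requests; a minimal sketch of such a pod spec (name and numbers are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: big-app-pod
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:      # nodes that cannot satisfy these requests are filtered out
          cpu: "10"
          memory: 4Gi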
Install kube-scheduler
Get the kube-scheduler binary from Kubernetes docs page
Run it as a service
View kube-scheduler options via kubeadm
/etc/kubernetes/manifests/kube-scheduler.yaml
ps -aux | grep kube-scheduler
Kubelet
It’s like a captain on a ship
Leads all activities on the ship, is the sole point of contact with the master ship, and sends back reports at regular intervals
The kubelet on a k8s worker node registers the node with the Kubernetes cluster
When it receives instructions to load a container or a pod on the node, it requests the container run-time engine to pull the required image and run an instance
The kubelet then monitors the node and the pod and sends reports to the kube-api server on a regular basis
Unlike the other components, the kubelet is NOT deployed automatically by kubeadm → YOU MUST ALWAYS INSTALL THE KUBELET MANUALLY on your worker nodes
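As with the other components, you can inspect the running kubelet and its options on a node:
ps -aux | grep kubelet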
Kube Proxy
Within a k8s cluster, every pod can reach every other pod. This is accomplished by deploying a pod networking solution to a cluster
Pod Network: an internal virtual network that spans all the nodes in the cluster and to which all the pods connect.
- Through this network, they are able to communicate with each other
Eg → a web application deployed on the first node and a DB on the second
The web app can reach the DB simply by using the IP of the DB pod
but there is no guarantee that the IP of the DB pod will always remain the same
A better way for the web app to access the DB is by using a service.
Create a Service to expose the DB application across the cluster
web app can now access the DB using the name of the service
Service also gets an IP address assigned to it
Whenever a pod tries to reach the service, using its IP or name, the service forwards the traffic to the DB pod (the backend)
The service cannot join the pod network, because the service is not an actual thing → it is not a container like a pod; it is a virtual component that lives only in Kubernetes memory. It does not have any actively listening process
So, how is service accessible across the cluster from any node?
Via kube-proxy → a process that runs on each node in a k8s cluster; its job is to look for new services
Every time a new service is created, it creates the appropriate rules on each node to forward traffic destined for that service to the backend pods
One way it does this is by creating iptables rules on each node in the cluster, forwarding traffic heading to the IP of the service to the IP of the actual pod
Install kube-proxy
Download the binary from the Kubernetes release page and run it as a service
The kubeadm tool deploys kube-proxy as pods on each node; in fact, it is deployed as a DaemonSet, so a single kube-proxy pod always runs on every node in the cluster
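To see this on a kubeadm cluster (the DaemonSet name kube-proxy is the kubeadm default):
kubectl get daemonset kube-proxy -n kube-system
kubectl get pods -n kube-system -o wide | grep kube-proxy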
Pods
Assumption:
Docker images have already been created
The Kubernetes cluster has already been set up and is running
All services are in a running state
With K8s, our ultimate aim is to deploy our application in the form of containers on a set of machines that are configured as worker nodes in a cluster
K8s does not deploy containers directly on the worker nodes
Containers are encapsulated into a Kubernetes object known as pods
A pod is a single instance of an application
A pod is the smallest object that you can create in Kubernetes
pod-definition via YAML
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels: # to mark the pod for later use (can have any number of key-value pairs)
    app: myapp
    type: front-end
spec:
  containers: # List/Array
    - name: nginx-container
      image: nginx
# To create the pod from the file
kubectl create -f <filename>
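To check that the pod from the definition above came up:
kubectl get pods
kubectl describe pod myapp-pod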
Controllers → brain behind k8s
- They are the processes that monitor the k8s objects and respond accordingly
ReplicaSets
What is a Replica? Why do we need a replication controller?
If there is only a single pod running our application and that pod fails, the entire application goes down
- In order to prevent users from losing access to our application, we would like to have more than one instance of our application at the same time (Fault Tolerance)
High Availability: Replication Controller allows us to be able to run multiple instances of our application at the same time
Do we still need a replication controller if we have a single pod? → Yes
- Even if we have a single pod, in case that pod fails, the replication controller will automatically bring up a new pod
Load Balancing & Scaling: We need a Replication Controller to run multiple pods to share the load across them
Eg: If the number of users accessing the app increases, we increase the number of pods. If users increase further and we run out of space on the node, the replication controller allows us to run additional pods across multiple nodes in the cluster
Replication controller
- It is the older technology that is being replaced by ReplicaSet
ReplicaSet
The newer, recommended way to set up replication
rc-definition.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: myapp-src
  labels:
    app: myapp
    type: front-end
spec:
  template: # pod template
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-pod
    spec:
      containers:
        - name: nginx-container
          image: nginx
  replicas: 3
kubectl create -f rc-definition.yaml
kubectl get replicationcontroller
kubectl get replicaset
kubectl get pods
replicaset.yaml
(selector is optional in ReplicationController, but required here)
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-replicaset
  labels:
    app: myapp
    type: front-end
spec:
  template:
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-pod
    spec:
      containers:
        - name: nginx-container
          image: nginx
  replicas: 3
  selector: # identifies which pods fall under this ReplicaSet, since it can also manage pods that were not created by this YAML file
    matchLabels:
      type: front-pod
Labels and Selectors
The role of the ReplicaSet is to make sure the desired number of replicas are running in the system at all times. In case any pod fails, it immediately deploys a new one to replace it
ReplicaSet is in fact a process that monitors the pods
How does ReplicaSet know which pod to monitor
- Labelling works as a filter to query the pods that we want to monitor
If pods already exist that we filter and monitor via the ReplicaSet, why do we need to define a template for the pod in the ReplicaSet?
So that in case the ReplicaSet wants to deploy a new pod, it has the information it needs to create one
How to update the replicas from a Replicaset
- Change the number of replicas in the YAML file and then apply
kubectl replace -f replicaset-definition.yaml
kubectl scale --replicas=6 -f replicaset-definition.yaml
Setting it via type and name (this won’t change anything in the definition file):
kubectl scale --replicas=6 replicaset myapp-replicaset
Automatically scaling based on load
kubectl delete replicaset myapp-replicaset # Also deletes all underlying PODs
kubectl replace -f replicaset-definition.yaml
Deployments
If you want to deploy your application in a production environment, you will want many instances of the application running, for obvious reasons
Whenever a new version of the builds is updated on the Docker registry, you would like to upgrade your instances seamlessly.
However, when you want to upgrade your instances, you don’t want to upgrade them all at once, as this may impact users accessing your application (Rolling Updates)
In case any of the updates causes an issue in your instances, you would like to roll back your changes.
Suppose you want to make multiple changes to your environment. You don’t want to apply each change immediately after the command is run; instead, you would like to pause your environment, make the changes, and then roll them out together
All of these capabilities are available in K8s Deployments
Deployment: Kubernetes Object that comes higher in the hierarchy
- Provides us with the capability to upgrade the underlying instance seamlessly using Rolling Updates (which allow for undo changes, pause and resume changes as required)
How do we create a deployment?
Create a Deployment definition file; its contents are exactly similar to those of a ReplicaSet, except for kind: Deployment
deployment-definition.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
  labels:
    app: myapp
    type: front-end
spec:
  template:
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-pod
    spec:
      containers:
        - name: nginx-container
          image: nginx
  replicas: 3
  selector: # identifies which pods fall under this Deployment, since it can also manage pods that were not created by this YAML file
    matchLabels:
      type: front-pod
kubectl create -f deployment-definition.yaml
kubectl get deployments
kubectl get all # to see all the created resources at once
- This creates a ReplicaSet, which in turn creates pods, so you can view them too
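The rolling update, rollback, and pause/resume capabilities mentioned above are driven through the kubectl rollout subcommands; a quick sketch using the deployment created above:
kubectl rollout status deployment/myapp-deployment
kubectl rollout history deployment/myapp-deployment
kubectl rollout undo deployment/myapp-deployment
kubectl rollout pause deployment/myapp-deployment
kubectl rollout resume deployment/myapp-deployment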
Services
K8s services enable communication between various components
It helps us connect applications together
Services make it possible for the frontend application to be made available to end users
It helps communication between backend and frontend pods and helps in connectivity to an external datasource
Services enable loose coupling between microservices in our Application
Service Type
NodePort: The service makes an internal pod port accessible on a port on the node
Cluster IP: The service creates a virtual IP inside the cluster to enable communication between different services
Load Balancer: It provisions a load balancer for our application in a supported cloud provider
NodePort
A service can help us by mapping a port on the node to a port on the pod
There are 3 ports involved: the port on the pod, where the actual web server is running ⇒ targetPort
the port on the service itself ⇒ port
- These terms are from the viewpoint of the service
Service → is like a virtual server inside the node
Inside the cluster, it has its own IP address, and that IP address is called the ClusterIP of the service
And finally, we have the port on the node itself, which we use to access the web server externally ⇒ Node PORT
A NodePort can only be in a valid range, which by default is from 30000 to 32767
How to create a service?
service-definition.yaml
If you don’t provide a targetPort, it is assumed to be the same as port
If you don’t provide a nodePort, a free port within the valid range is allotted automatically
You can have multiple port mappings within a single service, as ‘ports’ is an array
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  type: NodePort
  ports:
    - targetPort: 80
      port: 80 # port on the service object
      nodePort: 30008
  selector:
    app: myapp
    type: front-end # taken from the pod we want to expose
kubectl create -f service-definition.yaml
kubectl get services
# when the service is created, it looks for matching pods with the given labels; it then selects all of them as endpoints to forward external traffic to
# it uses a random algorithm to select which pod to send the request to
What if the pods are distributed across multiple nodes?
- In this case, we have a web application on pods on separate nodes in a cluster
When we create a service, without us having to do any additional configuration, Kubernetes automatically creates a service that spans across all nodes in the cluster and maps the target port to the same node port on all the nodes
This way, you can access your application using the IP of any node in the cluster and using the same port number, which in this case is 30,008
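For example (node IPs are illustrative):
curl http://192.168.1.2:30008
curl http://192.168.1.3:30008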
To summarize, in any case, whether it be a single Pod on a single node, multiple Pods on a single node, or multiple Pods on multiple nodes, the service is created exactly the same, without you having to do any additional steps during the service creation.
When Pods are removed or added, the service is automatically updated, making it highly flexible and adaptive.
Once created, you won't typically have to make any additional configuration changes.
Services → Cluster IP
A full-stack application has frontend, backend, db , and datastore pods; they all need to communicate with each other
What is the best way to do so?
Pods have IP addresses assigned to them, but these IPs, as we know, are not static
What if one pod needs to connect to the backend service? Which backend pod would the request go to? And who makes that decision?
A k8s service can help us group the pods together and provide a single interface to access the pods
The requests are forwarded to one of the pods under the service randomly
This enables us to easily & effectively deploy a microservices-based application on k8s cluster
Each layer can now scale or move as required without impacting communication
Each service gets an IP and name assigned to it inside the cluster, and that is the name that other pods should use to access the service ⇒ CLUSTER IP
service-definition.yaml
apiVersion: v1
kind: Service
metadata:
  name: back-end
spec:
  type: ClusterIP # default type
  ports:
    - targetPort: 80 # backend is exposed
      port: 80 # service is exposed
  selector:
    app: myapp
    type: back-end
Services → Load Balancer
The services with type Nodeport help in receiving traffic on the ports on the nodes and routing the traffic to the respective ports
But what URL would you give your end users to access the application? (you only have node IP and port combinations)
One way to achieve this is to create a new VM for load-balancing purposes and install a suitable load balancer on it, like HAProxy or NGINX, then configure the load balancer to route traffic to the underlying nodes
Another method is to use the native load balancer of a supported cloud platform, as Kubernetes supports integrating with the native load balancers of certain cloud providers and configuring them for us
Set the service type to LoadBalancer instead of NodePort
Remember, this only works with supported cloud platforms: GCP, AWS, Azure
In an unsupported environment, it works exactly like NodePort, where the service is exposed on a high port of the nodes
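A minimal sketch of the same service switched to the LoadBalancer type (reusing the labels from the earlier examples):
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  type: LoadBalancer
  ports:
    - targetPort: 80
      port: 80
      nodePort: 30008
  selector:
    app: myapp
    type: front-end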
Namespaces
Whatever we do in k8s, we do in a namespace (house)
If we don’t specify a namespace, our objects go into the “default” namespace, which is created automatically when the cluster is first set up
K8s creates a set of pods and services for internal purposes, such as those required by the network solution, the DNS service, etc.
- To isolate these from the user and to prevent you from accidentally deleting or modifying these services, Kubernetes creates them under the namespace “kube-system” → also created at cluster startup
Another namespace created by k8s is kube-public; this is where resources that should be made available to all users are created
You can create your own namespaces as well
Each of these ns can have its own set of policies, which define who can do what
- You can also assign a quota of resources to each of these namespaces; that way, each namespace is guaranteed a certain amount and does not use more than its allowed limit
The resources within a namespace can refer to each other simply by using their name
If required, to reach a resource in another namespace, you must append the name of ns to the name of the resource
- Eg→ servicename.namespace.svc.cluster.local
You are able to do this because when a service is created, a DNS entry is added in this format
cluster.local → default domain name of K8s cluster
svc → subdomain of service
kubectl get pods # list pods in default ns
kubectl get pods -n dev # list pods in dev ns
kubectl get pods --namespace=kube-system
kubectl create -f pod-definition.yml --namespace=dev
# create pod in ns = dev
# you can also add namespace: dev under metadata of pod-definition.yml
kubectl create namespace dev
# to switch to another namespace, so you don't have to specify the namespace with each command, use this
kubectl config set-context $(kubectl config current-context) --namespace=dev
kubectl get pods --all-namespaces # list all pods in all namespace
namespace-def.yml
apiVersion: v1
kind: Namespace
metadata:
  name: dev
Imperative vs Declarative
Specifying what to do and how to do it is an Imperative approach
Specifying the final destination without going over any step-by-step instructions, the system figures out the right path (specifying what to do, not how to do) is the Declarative Approach
In Kubernetes, this translates into two ways of managing objects
Imperatively → with many kubectl commands
- good for learning and interactive experimentation
kubectl edit pod <pod-name> → make changes in the k8s memory object
- Make changes in the pod-definition file and then perform kubectl replace -f nginx.yml
Declaratively → by writing manifests and using kubectl apply
The latter is good for reproducible deployments
In this approach, instead of creating or replacing the object, we use the kubectl apply command to manage the object
This command is intelligent enough to create the object if it doesn’t exist. If there are multiple object configuration files, as you usually would have, you may specify the directory path instead of a single file
That way, all the objects are created at once
If the object exists, make updates to the object
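For instance (the directory path is illustrative):
kubectl apply -f nginx.yml              # a single definition file
kubectl apply -f /path/to/config-files/ # every definition file in the directory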
resource-quota.yml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: 5Gi
    limits.cpu: "10"
    limits.memory: 10Gi
kubectl apply
The apply command takes into consideration the local configuration file, the live object definition on K8s, and the last applied configuration before making a decision on what changes are to be made
So, when you run the apply command
If the object doesn’t exist, it gets created.
When an object is created, an object configuration, similar to what we created locally, is created within Kubernetes → with additional fields to store the status of the object → live configuration of the object on the k8s cluster
When you run a kubectl apply command, the YAML version of the local object configuration file we wrote is converted to a JSON format, and it is then stored as the last applied configuration.
Going forward, for any updates to the object, all three are compared to identify what changes are to be made to the live object
Once I make changes, → run kubectl apply → live configuration is updated, and then last applied configuration (JSON one) is updated
Why do we need the last applied configuration?
If a field is deleted from the local file and we then run the kubectl apply command, we see that the last applied configuration had that field → meaning the field needs to be removed from the live configuration
The last applied configuration helps us figure out what fields have been removed from the local file
We know that the local file is stored on our system, the live configuration is stored in Kubernetes memory, and the last applied configuration (the JSON one) is stored within the live configuration itself, under the annotation kubectl.kubernetes.io/last-applied-configuration
Only kubectl apply stores this last-applied-configuration annotation; the create and replace commands do not
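A quick way to see it (assuming the pod from earlier was created with kubectl apply):
kubectl apply view-last-applied pod myapp-pod
kubectl get pod myapp-pod -o yaml | grep -A1 last-applied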