Kubernetes Explained: From Code to Cloud

Moni MK
15 min read

Why Kubernetes? (A Real-World Problem and the Origin of K8s)

Let’s rewind to around 2014, at Google.

Google had a problem that many modern companies now face: they were running thousands (millions!) of applications across data centers all over the world. Every app was split into smaller parts (now known as microservices). They needed to:

  • deploy microservices consistently

  • scale them up/down during peak traffic

  • automatically restart crashed services

  • roll out new features without breaking users

  • ensure services could talk to each other securely and efficiently

But it was chaos trying to do all this manually or with scripts. Google solved this internally with a system called Borg, their in-house container orchestration tool.

But they realized every other company was also starting to break monoliths into microservices and would soon face the same chaos. So, with its big heart, Google open-sourced a new, more flexible system inspired by Borg. That's how Kubernetes was born: https://github.com/kubernetes/kubernetes

How Is a Typical App Packaged and Deployed?

We all know the drill: you write some code, say a Spring Boot app, and build it into a .jar or .zip file.

The traditional way: copy it to a server, install Java/Node manually, start it using system commands or scripts, and monitor and restart it manually if it crashes 🤦‍♂️ Sounds fragile, right?

🚚 Enter Docker, Yeah our Savior, our Messiah.

With Docker: we describe our app in a Dockerfile in a very few steps, starting from a base image and declaring our app, its runtime (like Java), and how to run it. We build an image with everything inside (OS layers + app + dependencies), and you can run that image as a container on any system with Docker:

FROM openjdk:17
COPY target/myapp.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]

Now your app runs the same on your laptop, Dev, QA, or production — no more "it works on my machine" problems!

Why Docker Alone Isn’t Enough (Why Kubernetes?)

Docker helps us package and run an app. But when we go big and have 10+ microservices, 100+ containers, 3 cloud regions, and auto-scaling needs, we need more than just Docker. We need orchestration. We need management, not from an IIM graduate, but from a tool like K8s.

For all of these tasks, compare:

Task                     | Docker Alone            | Kubernetes
-------------------------|-------------------------|----------------------------
Run multiple containers  | Manually, using scripts | ✅ Declarative + scalable
Auto-restart crashed app | ❌ Manual               | ✅ Auto-healing
Load balancing           | ❌ Not built-in         | ✅ Via Services + Ingress
Rolling updates          | ❌ Hard to do safely    | ✅ Built-in support
Scaling up/down          | ❌ Needs scripting      | ✅ Horizontal autoscaling
Secret/config management | Basic                   | ✅ Secure & integrated

✨ Docker is the shipping container, Kubernetes is the port and logistics system that moves and manages them.

Kubernetes Architecture-

More details in the official documentation- https://kubernetes.io/docs/concepts/architecture/

Basically, any cluster of physical/virtual servers running all of these components is nothing but K8s.

✨ Control Plane (Master node)

  • API Server: Entry point for all kubectl, UI, or CI/CD commands. It validates and processes REST requests and communicates with etcd.

  • Scheduler: Decides which Node a Pod should run on, based on available resources and policies.

  • Controller Manager: Watches the cluster state and makes changes to move toward the desired state (replicas, jobs, node health, etc.).

  • etcd: A consistent, distributed key-value store for storing all cluster data like Pod specs, ConfigMaps, Secrets, etc.

🏥 Nodes (Worker nodes: where your Pods actually run. A cluster needs at least one worker node, and production clusters usually run several.)

  • Kubelet: Receives instructions from the API Server and manages Pods on that node. It reports status back.

  • Container Runtime: Actually pulls and runs containers (e.g., containerd or CRI-O; Docker Engine now plugs in via cri-dockerd).

  • Kube Proxy: Handles network routing on each node using iptables/IPVS so that traffic for Services reaches the right Pods. kube-proxy runs as a DaemonSet on every node in a Kubernetes cluster, and iptables (important) is the underlying kernel-level feature and userspace tool that kube-proxy uses to do its job.

All of the above are just high-level terms; a lot is happening behind each of them, so do your own research and deep dive for a better understanding.

For instance, the control plane components are themselves nothing but Pods. But if they are Pods on the master node, who manages them, and in what namespace???
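
If you're curious: on a typical kubeadm-provisioned cluster they run as static Pods in the kube-system namespace, managed directly by the kubelet on the control plane node rather than by a controller. You can see them with:

kubectl get pods -n kube-system   # kube-apiserver, etcd, kube-scheduler, kube-controller-manager, kube-proxy, CoreDNS...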

Service-to-Service Communication and Networking in K8s

How do services find each other? Not by Pod IP, because every time your pod (which may hold more than one container, e.g., your app plus an init container) crashes or restarts, it gets a new IP. Because of this, managing IPs for the same pod is a nightmare; now imagine hundreds of pods (cry for help). Hence, Kubernetes uses virtual networking so that every Pod can talk to every other Pod using its IP or a Service name (the short Service name resolves within the same Namespace, the fully qualified name cluster-wide).

    • Each Pod gets its own IP in a flat network

    • Services provide stable DNS names and IPs for groups of Pods

    • DNS entries are auto-generated by Kubernetes

    • Communication is handled by kube-proxy (using iptables/IPVS) together with CNI (Container Network Interface) plugins that set up the Pod network.

Now imagine I have an app called ‘merijaan’, exposed by a Service of the same name, in the namespace ‘tukahahai’. My service URL for internal cluster communication will then follow this format:

<service-name>.<namespace>.svc.cluster.local

So my app's URL will be http://merijaan.tukahahai.svc.cluster.local
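
To make that concrete, here is a minimal Service sketch (merijaan and tukahahai are just the hypothetical names from above) that would answer to http://merijaan.tukahahai.svc.cluster.local inside the cluster:

apiVersion: v1
kind: Service
metadata:
  name: merijaan          # first part of the DNS name
  namespace: tukahahai    # second part of the DNS name
spec:
  selector:
    app: merijaan         # routes to Pods carrying this label
  ports:
  - port: 80              # port other services call
    targetPort: 8080      # port the container actually listens on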

Namespaces in Kubernetes-

Namespaces are like folders or environments (dev, staging, prod) inside your cluster. They provide a simple, virtual, independent separation for your Pods and deployments, supporting:

  • Isolation

  • Resource quotas

  • Access control

      kubectl get services -n prod
      kubectl get pods -n dev
    

Secrets and Security in Kubernetes

Secrets are Kubernetes objects used to store sensitive information such as:

  • API keys, Passwords, TLS certificates

They are Base64-encoded by default (not encrypted unless encryption at rest is configured) and can be mounted into Pods as environment variables or files.
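
As a rough sketch (the names and values here are made up), a Secret and a container spec consuming it as environment variables might look like this; stringData is a convenience field that Kubernetes base64-encodes for you:

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:                 # plain text here; stored base64-encoded
  DB_USER: admin
  DB_PASSWORD: s3cr3t

# ...and in the Pod/Deployment spec, inject every key as an env variable:
containers:
- name: app
  image: myapp:1.0
  envFrom:
  - secretRef:
      name: db-credentials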

Security Measures are-

  • RBAC (Role-Based Access Control): Limit who can do what on the cluster

  • Namespaces: Provide isolation between teams or environments

  • Network Policies: Restrict traffic between Pods

  • Pod Security Context: Run containers as non-root

  • TLS everywhere: Kubernetes supports HTTPS communication within the cluster
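
To make the first of these (RBAC) concrete, here is a minimal sketch (names are hypothetical) of a Role that only allows reading Pods in the dev namespace, bound to a single user:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
- apiGroups: [""]                  # "" means the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]  # read-only access
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: User
  name: jane                       # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io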

Autoscaling and Health Checks-

Based on traffic, pods are scaled up/down automatically if autoscaling is configured. Kubernetes has two main types of pod autoscaler:

Horizontal Pod Autoscaler (HPA)- The core idea here is to automatically adjust the number of pods based on CPU/memory usage, scaling horizontally without restarts or downtime. Ideal for stateless applications (no memory of previous requests) or applications designed to be scaled out by running multiple, identical instances. Good for peak-hour traffic, say Big Billion Days or Diwali sales, where downtime is a big Noooooo.
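
A minimal HPA manifest might look like this (the Deployment name and thresholds are just examples); it keeps between 2 and 10 replicas and targets ~70% average CPU utilization:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:              # what to scale
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add pods when average CPU goes above ~70%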

Vertical Pod Autoscaler (VPA)- Unlike the HPA, this is less popular and works as an add-on; it is not part of core Kubernetes like the HPA. The core idea is similar, since both use resource metrics (CPU/memory) to make scaling decisions. However, as the names suggest, they scale in fundamentally different ways. The VPA is perfect for applications that cannot easily be scaled horizontally (e.g., stateful applications with a single instance), or to ensure that each pod is "right-sized" to prevent resource waste or performance issues. It is often best used to get concrete, optimized initial resource recommendations before you set up the HPA. In Auto mode, it can automatically evict and recreate pods with the new, recommended resource settings. This is a crucial point: it cannot change the resources of a running pod. To apply a change, it must terminate the pod and create a new one with the updated values. In other words, there are no additional pods; it adjusts the resources of the same pods, so restarts are required and there will be brief downtime because of this.
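
For comparison, a VPA object could be sketched like this (it assumes the VPA add-on is installed in the cluster, since it is not part of core Kubernetes; the Deployment name is hypothetical):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"   # evicts and recreates pods with new resource requests;
                         # use "Off" to only get recommendations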

Liveness and Readiness Probes-

A probe is used to check whether a container is healthy and ready to serve traffic, i.e., do not send user/client requests to a container before it is ready, thus avoiding 503 Service Unavailable or other HTTP error responses to the client. To do this, Kubernetes uses probe objects to check the container's status internally. Probes are what the kubelet (the agent that runs on each node) uses to determine the state of a container. They are essential for building resilient, self-healing applications in a distributed environment.

There are three main types of probes-

1. Liveness Probe- A liveness probe determines if a container is still alive and functioning correctly. It's designed to catch situations where an application is running but is in a broken or unresponsive state (e.g., a deadlock, an infinite loop, or a memory leak that has made it unresponsive). The action on failure is just like our IT team's favorite solution, ‘restart your system’: if a liveness probe fails, Kubernetes kills and restarts the container.

2. Readiness Probe- A readiness probe determines if a container is ready to receive network traffic. This is crucial for applications that need time to start up, load data, or connect to external services before they can handle requests. On failure, Kubernetes removes the pod's IP address from all matching Service endpoints, which means no new traffic is routed to that pod until the probe succeeds again.

3. Startup Probe- A startup probe is specifically for applications that have a long or unpredictable startup time, say a legacy system, or one that needs to load a large configuration file, connect to a database, and warm up its cache before it can be considered truly "healthy". On failure, Kubernetes restarts the container, same as the liveness probe. Wait, you might have a question: if the action for both probe 1 and probe 3 is to kill/restart, what is the difference? While both can lead to a container restart, they address different phases of a container's lifecycle. The startup probe confirms the container has successfully started and is no longer in its initialization phase; only then are the other two probes enabled (a minimal sketch follows below).
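
For example, for a Spring Boot app, liveness and readiness probes in the container spec might look like this (the /actuator paths, port, and timings are assumptions about the app, not something Kubernetes mandates):

containers:
- name: myapp
  image: myapp:1.0
  livenessProbe:
    httpGet:
      path: /actuator/health/liveness
      port: 8080
    initialDelaySeconds: 30
    periodSeconds: 10
    failureThreshold: 3       # three failed checks in a row -> restart the container
  readinessProbe:
    httpGet:
      path: /actuator/health/readiness
      port: 8080
    periodSeconds: 5          # failing checks remove the Pod from Service endpoints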

Deployment lesson 101 - Kubernetes is so huge and powerful that it cannot be described in a single article. Out of so many components, why have I chosen probes to discuss? Because of a problem I faced while deploying my application; it took me a week to realize the core issue.

My situation- I had only defined a liveness probe, copied from another application of mine which was lightweight and started quickly. This application, however, was a core layer, so it had many prerequisites: you know, creating 20+ Kafka topics and their corresponding DLT topics, connecting to MSK, Aurora DB, a Redis cluster, etc. In short, the startup time was much higher than the others. But, but, but, it was not starting at all. On my local machine everything was smooth; the app was running like Ronaldo. In the environment it was not running at all, as if it were a goalkeeper.

Now, if you only use a liveness probe (like I did) for a heavy app:

  • Say you set initialDelaySeconds to something like 30 seconds to give the app a little time to breathe.

  • But after 30 seconds, your application is still starting up.

  • The liveness probe fires an HTTP request to test the pod's status, it fails (because the app isn't ready to respond on its health-check endpoint), and Kubernetes restarts the container, as per the liveness probe rule.

  • This creates an endless restart loop. The application never gets enough time to fully start, so it's constantly being killed and relaunched.

This is the exact problem that the startupProbe was introduced to solve. Its primary function: To tell Kubernetes, "Hey, this container is still in its initialization phase. Don't worry if it's not responding to health checks yet. Just give it time."
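
The fix in my case was essentially to add a startup probe with a generous window, something along these lines (the numbers and health path are illustrative). The liveness and readiness probes stay disabled until it succeeds, and the container is only restarted if the app has not come up within failureThreshold × periodSeconds (here 30 × 10 = 300 seconds):

startupProbe:
  httpGet:
    path: /actuator/health     # assumed Spring Boot health endpoint
    port: 8080
  failureThreshold: 30         # allow up to 30 failed attempts...
  periodSeconds: 10            # ...checked every 10s = 5 minutes to finish starting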

Init Containers and Sidecars-

We know that a pod can contain multiple containers: your app container and some helper containers. Two of those helping heroes are init containers and sidecar containers.

Init Containers- Their sole purpose is to perform one-time setup and initialization tasks for the main application container(s) before it starts, e.g., waiting for a DB connection, topic creation, migrations, or other dependency setup. They run sequentially in the order they are defined in the Pod manifest. Each init container must run to completion successfully (exit with code 0) before the next one starts, and only after they have all completed does the main app container start.

Sidecar Containers- As the name suggests, a sidecar runs side by side with the main container throughout its lifecycle, like a shadow. Sidecars share the Pod's lifecycle with the main container: they start and stop together. If a sidecar container crashes, Kubernetes will restart it, but the main container remains unaffected. Typical uses:

  • Logging and monitoring: a sidecar can tail a log file from the main container's shared volume and ship those logs to a centralized logging system (e.g., Fluentd, Logstash).

  • Proxies and service meshes: a sidecar can act as a proxy for all inbound and outbound network traffic, handling tasks like traffic routing, authentication (mTLS), and telemetry collection (this is the core pattern used by service meshes like Istio and Linkerd).
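
A rough sketch of a Pod using both patterns (the image names and DB host are hypothetical): an init container waits for the database before the app starts, and a log-shipping sidecar runs alongside the app, sharing a volume:

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  initContainers:
  - name: wait-for-db                # must exit 0 before the app container starts
    image: busybox:1.36
    command: ["sh", "-c", "until nc -z my-db 5432; do echo waiting for db; sleep 2; done"]
  containers:
  - name: app
    image: myapp:1.0
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app        # the app writes its logs here
  - name: log-shipper                # sidecar: ships logs for the app's whole lifetime
    image: fluent/fluent-bit:2.2
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  volumes:
  - name: app-logs
    emptyDir: {}                     # scratch volume shared by both containers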

Taints, Tolerations, and Node Affinity-

Kubernetes gives you tools to control where Pods get scheduled.

🚫 Taints and Tolerations

  • Taints: Mark a Node so that it repels Pods that don't explicitly tolerate the taint, effectively reserving it for specific Pods only.

  • Tolerations: Allow specific Pods to be scheduled on tainted Nodes

      kubectl taint nodes node1 key=value:NoSchedule
    
spec:
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"

Node Affinity

Control Pod placement using labels and rules. In Kubernetes, affinity is a powerful feature that gives you more control over the scheduling of your pods, allowing you to influence which nodes they are placed on. It's a more expressive and flexible alternative to the simpler nodeSelector.

The basic idea of affinity is to define rules that "attract" pods to certain nodes or to other pods. It works by matching labels on nodes and/or other pods. For example, the rule below ensures Pods are only scheduled on Nodes in the specified zones.

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - us-west1-a
            - us-west1-b

Ingress and Load Balancing

This is a fundamental and often confusing topic in Kubernetes networking. The key to understanding the difference is to realize that a Load Balancer is a type of service that operates at Layer 4 of the OSI model, while Ingress is a set of rules that operates at Layer 7 and is implemented by an Ingress Controller.

  • How a LoadBalancer works: When you create a Service of type LoadBalancer, Kubernetes works with your cloud provider (AWS, GCP, Azure, etc.) to automatically provision a cloud-native load balancer via the cloud-controller-manager (CCM) component of the control plane. It operates at Layer 4 (the transport layer) of the OSI model, which means it forwards traffic based on IP address and port, without inspecting the content of the request.

    Features: It's a straightforward way to expose a single service directly to the internet.

    • Protocols: It can handle various protocols like TCP, UDP, HTTP, and HTTPS.

    • Cost: A significant drawback is that each LoadBalancer Service typically gets its own dedicated public IP address and provisions a separate cloud load balancer, which can become expensive if you have many services to expose. It also limits your control to whatever the cloud provider offers.

      Example: You have a single web application with multiple pods. You create a LoadBalancer service that exposes the application on port 80. The cloud provider's load balancer gets a public IP and routes all incoming traffic on that IP to one of the pods behind the service.
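
That example as a manifest might look roughly like this (the app label and ports are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer       # asks the cloud provider for an external LB and public IP
  selector:
    app: web
  ports:
  - port: 80               # port exposed by the load balancer
    targetPort: 8080       # port the app container listens on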

Ingress (i.e., incoming traffic)- It is a more advanced and flexible way to expose services. It's not a Service type itself, but a Kubernetes API object that defines a set of routing rules.

  • How it works: An Ingress object is a declarative way to define how external HTTP/HTTPS traffic should be routed to different services within the cluster. It requires a dedicated Pod called an Ingress Controller to be running in your cluster to actually implement these rules. Popular Ingress Controllers include NGINX, Traefik, and Istio. It operates at Layer 7 (Application Layer) of the OSI model. This allows it to make "smarter" routing decisions based on the content of the request, such as the host header or the URL path, basically more customized control

  • Features: A single Ingress Controller (and therefore a single external IP address) can be used to route traffic to multiple different services. This is a major cost-saver.

    • Advanced Routing: It enables features like:

      • Host-based routing: www.myapp.com goes to one service, while api.myapp.com goes to another.

      • Path-based routing: www.myapp.com/api goes to the API service, while www.myapp.com/blog goes to the blog service

        Example: You have a blog service and an API service. You create an Ingress object with rules that say: Traffic for blog.yourcompany.com should go to the blog-service. and traffic for api.yourcompany.com should go to the api-service.

      • The Ingress Controller watches these rules, and a single external load balancer (often provisioned by the Ingress Controller itself as a LoadBalancer Service) directs traffic to the correct backend service. { Pretty cool, yeah!!! }

        The most important thing to understand is that Ingress often uses a LoadBalancer Service to expose itself to the outside world. A common pattern is:

          Client -> Cloud Load Balancer (via LoadBalancer Service) -> Ingress Controller Pod -> Internal Service -> Pod
        

Use Ingress when you have multiple services that need to be exposed, and you want to use a single entry point with advanced routing, SSL termination, and cost-effectiveness. This is the standard and most powerful way to handle external access for production web applications.
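
Tying the earlier example together, an Ingress with host-based routing could be sketched like this (it assumes an NGINX Ingress Controller is installed and that blog-service and api-service already exist):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: company-ingress
spec:
  ingressClassName: nginx            # assumes the NGINX Ingress Controller
  rules:
  - host: blog.yourcompany.com       # host-based routing rule 1
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: blog-service
            port:
              number: 80
  - host: api.yourcompany.com        # host-based routing rule 2
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80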

🌟 Conclusion

Kubernetes isn't just hype — it's a game-changer for microservices, especially in the cloud. It abstracts the heavy lifting and lets your team ship fast, fail safe, and scale effortlessly.

As an SDE, a solid understanding of K8s is crucial; it is not just for SREs or DevOps engineers. You are the architect of your application's behavior, and Kubernetes isn't just a deployment tool, it's a runtime environment. As a developer, the code you write, and the way you package it, directly influences how it will behave in a Kubernetes cluster. Knowing Kubernetes therefore empowers developers to write better, more robust, and more efficient applications. It enables us to troubleshoot problems more quickly and fosters a collaborative culture that ultimately leads to more reliable and scalable software.

So go ahead, give your apps a powerful home with Kubernetes. And let Ingress be the smart bouncer at the door.

"You can't write cloud-native code without understanding the cloud's native language. Kubernetes is that language."

Happy deploying! 🚀

#Kubernetes #K8s #CloudNative #SDE #DevOps #SoftwareEngineering #Microservices #Containerization #Dockerised #Docker #kube-system #CNCF.


Written by

Moni MK

I am a Software Development Engineer, sharing what I learn in my day-to-day development and how the industry juices up technology for the benefit of the consumer.