Kubernetes Cluster Upgrade: Achieving Zero Downtime in Production


Kubernetes ships a new minor version roughly every four months (about three releases a year), and the project maintains only the three most recent minor versions at any given time. Staying up to date is essential for security, stability, and new features.
Latest Kubernetes Versions: Kubernetes Releases
As of March 2025, the latest Kubernetes version is 1.32, meaning Kubernetes officially supports 1.32, 1.31, and 1.30.
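The support window is simple arithmetic on the minor version. A minimal shell sketch, assuming the version is passed as MAJOR.MINOR; supported_minors is a hypothetical helper name, not part of kubectl or eksctl:

```shell
# supported_minors LATEST
# Print the three minor versions maintained upstream, given the latest one.
# Hypothetical helper; expects MAJOR.MINOR input such as 1.32.
supported_minors() (
  latest="$1"
  major="${latest%%.*}"   # text before the first dot
  minor="${latest#*.}"    # text after the first dot
  echo "${major}.${minor} ${major}.$((minor - 1)) ${major}.$((minor - 2))"
)

supported_minors 1.32   # prints: 1.32 1.31 1.30
```

If the cluster's current version is not in that list, the upgrade is overdue.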
In this article, let's explore how to upgrade an Amazon EKS cluster (AWS Managed Kubernetes) while ensuring zero downtime in production.
Key Considerations Before Upgrading
🔹 Cordon Nodes: Prevent new workloads from being scheduled on nodes that are about to be replaced during the upgrade.
🔹 Communicate with the Application Team: Ensure they avoid deployments during the upgrade.
🔹 Review Kubernetes Release Notes & API Changes: If workloads still use APIs that are removed in the target version, they may fail post-upgrade.
🔹 No Downgrade Option: EKS does not support downgrading the Control Plane after an upgrade, so proceed with caution.
🔹 Ensure Version Consistency: Nodes (Kubelet) must never run a newer version than the Control Plane and should be brought to the same version soon after it is upgraded. The Cluster Autoscaler version should match the cluster's minor version.
🔹 Upgrade Lower Environments First: Test the upgrade in Dev/Staging, monitor for at least a week, then proceed to Production.
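Several of the considerations above can be checked from a terminal before the upgrade starts. The sketch below is an illustration, not an official procedure: skew_ok and preflight are hypothetical helper names, the three-minor-version limit is an assumption based on the upstream version skew policy for recent releases, and preflight assumes kubectl is already pointed at the cluster.

```shell
# Maximum number of minor versions a kubelet may lag behind the control plane.
# Assumption: 3, per the upstream skew policy for recent Kubernetes releases.
MAX_SKEW="${MAX_SKEW:-3}"

# skew_ok CONTROL_PLANE KUBELET   (both as MAJOR.MINOR, e.g. 1.32 1.31)
# Succeeds when the kubelet is not newer than the control plane and is at
# most MAX_SKEW minor versions behind it.
skew_ok() (
  cp_minor="${1#*.}"
  kubelet_minor="${2#*.}"
  diff=$((cp_minor - kubelet_minor))
  [ "$diff" -ge 0 ] && [ "$diff" -le "$MAX_SKEW" ]
)

# preflight NODE
# Cluster-facing checks; defined here but only run when you call it.
preflight() {
  # Deprecated APIs that clients have requested, as reported by the API server.
  kubectl get --raw /metrics | grep apiserver_requested_deprecated_apis \
    || echo "no deprecated API requests found in the metrics output"
  # Kubelet version reported by every node.
  kubectl get nodes -o custom-columns='NAME:.metadata.name,KUBELET:.status.nodeInfo.kubeletVersion'
  # Stop new pods from being scheduled on the node that is about to be replaced.
  kubectl cordon "$1"
}

skew_ok 1.32 1.31 && echo "a 1.31 kubelet is within skew of a 1.32 control plane"
```

Calling preflight with a node name runs the cluster checks. A cordoned node can be returned to service with kubectl uncordon.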
Step-by-Step EKS Upgrade Process
1️⃣ Upgrade the Control Plane (⏳ ~30 mins)
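The EKS Control Plane is not upgraded automatically while its version is still supported; you have to start the upgrade yourself.
Upgrade using the AWS Console or eksctl, one minor version at a time (for example, 1.31 to 1.32).
AWS manages the Control Plane's HA, DR, security, and API request handling; only the decision to upgrade is yours.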
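With eksctl, the control plane step might look like the sketch below. upgrade_control_plane is a hypothetical wrapper name and demo-cluster is a placeholder; the sketch assumes eksctl and kubectl are installed and authenticated against the right AWS account.

```shell
# upgrade_control_plane CLUSTER TARGET_VERSION
# Hypothetical wrapper; defined here but only run when you call it.
upgrade_control_plane() {
  # $1 = cluster name, $2 = target version (one minor above the current one)
  # Without --approve, eksctl only prints the plan. Run that first as a preview.
  eksctl upgrade cluster --name "$1" --version "$2" || return 1
  # Apply the upgrade for real.
  eksctl upgrade cluster --name "$1" --version "$2" --approve || return 1
  # The "Server Version" line should now show the target version.
  kubectl version
}

# Example call (cluster name is a placeholder):
#   upgrade_control_plane demo-cluster 1.32
```

Keeping the preview and the approved run as separate steps gives one last chance to confirm the cluster name and target version before anything changes.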
2️⃣ Upgrade Node Groups / Nodes / Fargate (⏳ 2-3 Hours, Depending on Node Count)
Managed Node Groups follow a rolling update (one node at a time).
If using Custom Launch Templates or Custom AMIs, you must update them manually.
Ensure new nodes have the same labels and taints as older ones to avoid scheduling issues.
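For a managed node group without a custom AMI, the node step could be sketched as follows. upgrade_nodegroup and node_scheduling_view are hypothetical helper names, and the cluster and node group names in the example call are placeholders.

```shell
# upgrade_nodegroup CLUSTER NODEGROUP TARGET_VERSION
# Hypothetical wrapper around the eksctl managed node group upgrade.
upgrade_nodegroup() {
  # $1 = cluster name, $2 = managed node group name, $3 = target version
  eksctl upgrade nodegroup --name "$2" --cluster "$1" --kubernetes-version "$3" || return 1
  # Every node should now report the target kubelet version.
  kubectl get nodes -o custom-columns='NAME:.metadata.name,KUBELET:.status.nodeInfo.kubeletVersion'
}

# node_scheduling_view
# Print labels and taint keys per node. Capture the output before and after
# the upgrade and compare the values your workloads select on.
node_scheduling_view() {
  kubectl get nodes --show-labels
  kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Example calls (names are placeholders):
#   node_scheduling_view > before.txt
#   upgrade_nodegroup demo-cluster ng-1 1.32
#   node_scheduling_view > after.txt
```

Node-specific labels such as the hostname will always differ between old and new nodes, so the comparison is about the labels and taints that node selectors, affinities, and tolerations actually reference.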
3️⃣ Upgrade Kubernetes Add-ons (VPC CNI, Kube-Proxy, CoreDNS, etc.)
Add-ons ensure networking, DNS resolution, and API communication function properly.
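Use
eksctl update addon
to update EKS-managed add-ons. For self-managed copies of the default add-ons, eksctl provides eksctl utils update-aws-node, eksctl utils update-kube-proxy, and eksctl utils update-coredns.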
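A minimal sketch of the add-on step, assuming the add-ons are EKS-managed and updated with eksctl update addon. update_addon is a hypothetical wrapper name, and the add-on version is left as a placeholder because it depends on the target Kubernetes version.

```shell
# update_addon CLUSTER ADDON_NAME ADDON_VERSION
# Hypothetical wrapper; defined here but only run when you call it.
update_addon() {
  # $1 = cluster name, $2 = add-on name (vpc-cni, kube-proxy, coredns, ...),
  # $3 = add-on version compatible with the new cluster version
  # List the installed add-ons and their current versions first.
  eksctl get addons --cluster "$1" || return 1
  # Update the chosen add-on to the requested version.
  eksctl update addon --name "$2" --version "$3" --cluster "$1"
}

# Example call (cluster name and version are placeholders):
#   update_addon demo-cluster vpc-cni <addon-version>
```

Updating add-ons one at a time and checking the affected pods after each one makes it easier to pin down which update caused a regression.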
How to Test the Upgrade?
✔️ Run Functional Tests: Validate application performance and stability.
✔️ Verify Logging & Monitoring: Check CloudWatch, Prometheus, and other monitoring solutions.
✔️ Confirm Autoscaling & Networking: Ensure pods are scaling correctly and network policies work as expected.
✔️ Test Rollbacks (If Needed): If issues arise, be prepared to revert workloads to the previous version.
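Some of these checks can be started from kubectl. The sketch below assumes the workloads are Deployments; verify_upgrade and rollback_workload are hypothetical helper names, and the namespace and deployment name in the example call are placeholders.

```shell
# verify_upgrade
# Quick health pass after the upgrade.
verify_upgrade() {
  # Every node should be Ready and report the new kubelet version.
  kubectl get nodes -o wide || return 1
  # Pods that are neither Running nor Succeeded deserve a closer look.
  kubectl get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded
}

# rollback_workload NAMESPACE DEPLOYMENT
# Revert one Deployment to its previous revision and wait for the rollout.
# This reverts the workload only; the cluster version itself stays as it is.
rollback_workload() {
  kubectl rollout undo "deployment/$2" --namespace "$1" || return 1
  kubectl rollout status "deployment/$2" --namespace "$1"
}

# Example calls (names are placeholders):
#   verify_upgrade
#   rollback_workload payments api-server
```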
Best Practices for EKS Upgrades
✅ Always upgrade Control Plane → Node Groups → Add-ons in order.
✅ Keep lower environments (Dev/Staging) at least one week ahead of Production.
✅ Validate ingress controllers, network policies, and storage compatibility before upgrading.
✅ Upgrade during a planned maintenance window to avoid unexpected disruptions.
Refer to the AWS documentation: https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
https://gist.github.com/iam-veeramalla/7e32999189f4aa9064334d1d27bd877c
Upgrading Kubernetes is a critical but manageable process with proper planning. Have you recently upgraded your EKS cluster? Share your insights and challenges in the comments!
#AWS #Kubernetes #EKS #Cloud #DevOps #K8sUpgrade #KubernetesUpgrades
Written by

Mahesh Velicheti
With over 9 years of experience in the IT industry, I have built a career centered on driving innovation in DevOps and Cloud Engineering. My journey with Tata Consultancy Services was marked by delivering cutting-edge automation solutions and enhancing cloud service delivery, leveraging tools like Terraform and other infrastructure-as-code technologies. Through strategic process optimization, I contributed to elevating operational efficiency, streamlining workflows, and achieving consistent service excellence.