Automating Scaling with Kubernetes Horizontal Pod Autoscaler (HPA)
Introduction:
The Horizontal Pod Autoscaler (HPA) is a key Kubernetes feature that automatically adjusts the number of replicas in a Deployment, ReplicaSet, or ReplicationController based on the metrics defined when the HPA is created, such as CPU utilization. It helps us match the number of pods to the application's usage at any given time.
Why do we need the Horizontal Pod Autoscaler (HPA)?
Modern applications experience fluctuating workloads. For example, an e-commerce website might get a huge load during a sale. During that time the HPA increases the number of pods based on the load, and when the sale ends and the load drops, it decreases the number of pods again based on CPU usage.
Step-by-Step process to Enable HPA:
Step-1: (Enable the Metrics Server).
The HPA relies on resource metrics collected by the Kubernetes Metrics Server, so first we need to install it on the Kubernetes cluster by applying its manifest.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
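Before moving on, it can help to confirm that the Metrics Server is actually serving data, since the HPA cannot compute CPU utilization without it. The sketch below assumes the manifest above created a Deployment named metrics-server in the kube-system namespace (the default for components.yaml); the function name check_metrics_server is only illustrative.

```shell
# Hypothetical helper: verify that the Metrics Server is up and returning data.
check_metrics_server() {
  # Wait until the metrics-server Deployment reports its replicas as ready.
  kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s || return 1
  # If these print CPU and memory figures, the metrics API is working.
  kubectl top nodes
  kubectl top pods
}

# Usage: check_metrics_server
```

If the rollout never becomes ready on a local test cluster, the cause is often kubelet certificate verification; the Metrics Server documentation describes the --kubelet-insecure-tls flag for that case.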
Step-2: (Create a Kubernetes Deployment file named deployment.yaml using the code below).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-deployment
  labels:
    app: microservice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: microservice
  template:
    metadata:
      labels:
        app: microservice
    spec:
      containers:
        - name: microservice
          image: httpd            # official Apache HTTP Server image on Docker Hub
          ports:
            - containerPort: 80
          resources:
            requests:             # the HPA computes CPU utilization relative to these requests
              cpu: "100m"
              memory: "200Mi"
            limits:
              cpu: "200m"
              memory: "400Mi"
Step-3: (Apply the Deployment).
In this step we apply the Deployment using the command below.
kubectl apply -f deployment.yaml
Step-4: (Expose the Deployment).
In this step we create a Service file named service.yaml to expose our Deployment.
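apiVersion: v1
kind: Service
metadata:
  name: hpa-service
spec:
  type: NodePort
  selector:
    app: microservice
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80      # field names are case-sensitive: targetPort, not TargetPort
      nodePort: 30007     # must fall within the cluster's NodePort range (30000-32767 by default)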
Step-5: (Apply the Service using the command below).
kubectl apply -f service.yaml
Step-6: (Create the HPA using the command below).
kubectl autoscale deployment hpa-deployment --cpu-percent=20 --min=1 --max=10
Here we set the target CPU utilization to 20%. Utilization is measured as a percentage of each pod's CPU request (100m in our Deployment) and averaged across all pods of the Deployment. Whenever the average exceeds this target, the HPA starts creating pods. We begin with a minimum of 1 replica and, depending on CPU utilization, the HPA can scale up to a maximum of 10 pods.
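To get a feel for how many pods a given load produces, the Kubernetes documentation describes the basic calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below reproduces only that arithmetic in shell; the function name desired_replicas is hypothetical, and the real controller additionally applies a tolerance and stabilization windows that are not modeled here.

```shell
# Simplified sketch of the HPA replica calculation (not the controller's actual code).
desired_replicas() {
  current_replicas=$1   # pods currently running
  current_util=$2       # average CPU utilization, in % of the CPU request
  target_util=$3        # value passed to --cpu-percent
  min_replicas=$4       # value passed to --min
  max_replicas=$5       # value passed to --max

  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))

  # Clamp the result to the configured bounds.
  if [ "$desired" -lt "$min_replicas" ]; then desired=$min_replicas; fi
  if [ "$desired" -gt "$max_replicas" ]; then desired=$max_replicas; fi
  echo "$desired"
}

desired_replicas 1 50 20 1 10   # 1 pod at 50% against a 20% target -> prints 3
desired_replicas 4 90 20 1 10   # would be 18, capped by --max -> prints 10
```

With the settings from this step, a single pod averaging 50% of its CPU request would therefore be scaled to about 3 pods, and no amount of load can push the Deployment past 10.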
Step-7: (Monitor the HPA).
Now we can watch new pods being created in real time based on CPU metrics; we just need to increase the load on our running container. First, start watching the cluster using the command below.
watch kubectl get all
The above command shows the status of the main Kubernetes objects in the current namespace and refreshes every 2 seconds, so keep it open in one terminal tab. Whenever the load increases, it shows in real time how the number of pods grows.
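For a narrower view, the HPA object itself can be watched, which shows the current and target CPU utilization next to the replica count. This sketch assumes the HPA was created with kubectl autoscale and no --name flag, in which case it takes the Deployment's name (hpa-deployment here); the function name watch_hpa is only illustrative.

```shell
# Hypothetical helper: stream changes to the HPA as the load varies.
watch_hpa() {
  hpa_name="${1:-hpa-deployment}"   # defaults to the name assumed in this walkthrough
  kubectl get hpa "$hpa_name" --watch
}

# Usage: watch_hpa            (press Ctrl+C to stop watching)
```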
kubectl exec -it deploy/hpa-deployment -- /bin/bash
The above command opens a bash shell inside the running container (kubectl picks one pod belonging to the Deployment). There we need to run the command below again and again to increase the load, while monitoring the status of the Kubernetes objects in the other terminal.
apt-get update
After running the above command several times, CPU utilization rises above the 20% target and you will see the number of pods increase. When we stop putting load on the container, CPU utilization drops, and after a few minutes the HPA starts terminating the extra pods.
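Typing a command by hand gives uneven load, so a steadier alternative is to send continuous HTTP requests to the Service from a temporary pod, similar to the approach used in the official HPA walkthrough. The sketch below assumes the Service is named hpa-service and lives in the current namespace; the function name generate_load and the pod name load-generator are illustrative, and whether plain page requests push CPU past 20% depends on the cluster.

```shell
# Hypothetical helper: hit the Service in a tight loop from a throwaway busybox pod.
generate_load() {
  service="${1:-hpa-service}"   # Service name from service.yaml
  # --rm deletes the pod when the loop is interrupted with Ctrl+C.
  kubectl run load-generator --rm -i --tty --image=busybox:1.28 --restart=Never -- \
    /bin/sh -c "while sleep 0.01; do wget -q -O- http://${service}; done"
}

# Usage: generate_load        (run in a second terminal, stop with Ctrl+C)
```

Once the loop is stopped, the pod count does not drop immediately: by default the HPA waits through a scale-down stabilization window (5 minutes unless configured otherwise) before removing pods.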
Written by
Gaurav Kumar
I have been working as a full-time DevOps Engineer at Tata Consultancy Services for the past 2.7 years. I have very good experience with the containerization tools Docker, Kubernetes, and OpenShift, and good experience using Ansible, Terraform, and others.