K8s HorizontalPodAutoscaler (HPA)
In this post, I want to show you how to scale Kubernetes pods with the HorizontalPodAutoscaler (HPA). Horizontal scaling means that the response to increased load is to deploy more pods. If the load increases, the HPA adds pods, up to a configured maximum; if the load decreases, it removes pods, down to a configured minimum.
Before you begin
You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts.
Set up a Kubernetes cluster
Deploy and configure Metrics Server to collect resource metrics
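If Metrics Server is not installed yet, one common way to deploy it is to apply the manifest published with the metrics-server releases. The sketch below wraps that in a small shell function. The function name install_metrics_server is my own, and on some local clusters Metrics Server may also need the --kubelet-insecure-tls argument, which this sketch does not handle.

```shell
#!/bin/sh
# Manifest published by the kubernetes-sigs/metrics-server project.
METRICS_SERVER_MANIFEST="https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml"

# Hypothetical helper (not part of kubectl): install Metrics Server and
# wait until its Deployment in kube-system has rolled out.
install_metrics_server() {
  kubectl apply -f "$METRICS_SERVER_MANIFEST" &&
    kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s
}

# Usage (needs a reachable cluster):
#   install_metrics_server
#   kubectl top nodes   # shows per-node CPU and memory once metrics are available
```

Without working resource metrics, the HPA cannot compute CPU utilization, so it is worth confirming that kubectl top returns data before continuing.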
Create example deployment and service
To demonstrate the HorizontalPodAutoscaler, you will first start a Deployment that runs a container using the hpa-example image, and expose it as a Service using the following manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
To deploy, run the following command:
kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
After applying the manifest, check the pod and service status with the following commands:
kubectl get pod   # check whether the pod is running
kubectl get svc   # check the services
NAME                          READY   STATUS    RESTARTS      AGE
php-apache-7495ff8f5b-xpcxx   1/1     Running   1 (97s ago)   23h

NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP   6d16h
php-apache   ClusterIP   10.110.24.116   <none>        80/TCP    23h
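Now that the pod and service are ready, create the HorizontalPodAutoscaler with kubectl. The following command keeps between 1 and 3 replicas and targets an average CPU utilization of 50% of each pod's requested CPU (200m in the manifest above).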
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=3
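The same autoscaler can also be described declaratively. The sketch below writes a manifest that should be equivalent to the kubectl autoscale command above, assuming your cluster serves the autoscaling/v2 API. The function name write_hpa_manifest and the default file name php-apache-hpa.yaml are my own choices.

```shell
#!/bin/sh
# Hypothetical helper: write an HPA manifest for the php-apache Deployment.
# $1 is the output file (defaults to php-apache-hpa.yaml, an assumed name).
write_hpa_manifest() {
  out="${1:-php-apache-hpa.yaml}"
  cat > "$out" <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
}

# Usage (needs a reachable cluster for the second step):
#   write_hpa_manifest
#   kubectl apply -f php-apache-hpa.yaml
```

Keeping the HPA in a file makes it easier to review and version alongside the Deployment, while kubectl autoscale is quicker for a one-off demo.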
To check the current status of the newly created HorizontalPodAutoscaler, run the following command:
kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         3         1          22h
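When we check the HPA, the current CPU usage is 0% because no clients are sending requests yet, and the current replica count is one. Now we need to add some load to the pod. Run the following command, which starts a temporary pod that requests the php-apache service in a loop: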
kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
Within a minute or so, run kubectl get hpa again and you should see a higher CPU load; for example:
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   171%/50%   1         3         3          23h
Once CPU utilization rises to 171%, well above the 50% target, the HPA automatically scales the number of replicas up to the maximum of 3. When we check the pods, we should see the following output:
NAME                          READY   STATUS    RESTARTS      AGE
php-apache-7495ff8f5b-dd7n2   1/1     Running   0             3m13s
php-apache-7495ff8f5b-wjp9r   1/1     Running   0             2m43s
php-apache-7495ff8f5b-xpcxx   1/1     Running   1 (16m ago)   23h
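The jump to three replicas follows from the scaling rule described in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to the configured minimum and maximum. The sketch below reproduces that arithmetic with the numbers from this walkthrough, assuming the 171% reading applied while one replica was running. The function name desired_replicas is my own, and the model ignores the controller's tolerance and stabilization behavior.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Simplified model of the HPA rule; utilizations are whole percentages.
desired_replicas() {
  current=$1; util=$2; target=$3; min=$4; max=$5
  # Integer ceiling of current * util / target.
  want=$(( (current * util + target - 1) / target ))
  # Clamp the result to the [min, max] range.
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}

desired_replicas 1 171 50 1 3   # ceil(3.42) = 4, capped at the maximum: prints 3
desired_replicas 3 0 50 1 3     # ceil(0) = 0, raised to the minimum: prints 1
```

The second call mirrors what happens after the load stops: with 0% utilization the rule asks for zero pods, and the minimum of 1 takes over.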
To stop the load test, press Ctrl+C in the terminal running the load generator. Once CPU utilization drops back to 0%, the HPA automatically scales the Deployment down to 1 replica. When we check the HPA and the pods, we should see the following output:
kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         3         1          23h

kubectl get pod
NAME                          READY   STATUS    RESTARTS      AGE
php-apache-7495ff8f5b-xpcxx   1/1     Running   1 (20m ago)   23h
Autoscaling the replicas may take a few minutes.
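Much of the delay on the way down comes from the HPA's scale-down stabilization window, which defaults to five minutes. If you want a faster scale-down in a test cluster, one option is to shorten that window on the HPA itself. The sketch below only builds the patch body; the function name scale_down_patch is my own, and the 60-second value in the usage line is an arbitrary example.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON merge patch that sets the HPA's
# scale-down stabilization window to the given number of seconds.
scale_down_patch() {
  printf '{"spec":{"behavior":{"scaleDown":{"stabilizationWindowSeconds":%d}}}}\n' "$1"
}

# Usage (needs a reachable cluster and an existing php-apache HPA):
#   kubectl patch hpa php-apache --type merge --patch "$(scale_down_patch 60)"
```

A shorter window makes the demo quicker, but in production it can cause replicas to flap when load fluctuates, so the default is usually a sensible starting point.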
I have covered the one method that I know. If you know others, share them in the comments for everyone!
Thanks For Reading, Follow Me For More
Have a great day!
Written by
TECH-NOTES