K8s HorizontalPodAutoscaler (HPA)

TECH-NOTES

In this post, I want to show you how to scale Kubernetes pods with the HorizontalPodAutoscaler (HPA). Horizontal scaling means that the response to increased load is to deploy more pods. If the load increases, the number of pods is scaled out, up to the configured maximum; if the load decreases, the number of pods is scaled back in, down to the configured minimum.

Before you begin

You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts.

  1. Set up a Kubernetes cluster

  2. Deploy and configure Metrics Server to collect resource metrics

Create example deployment and service

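To demonstrate the HorizontalPodAutoscaler, you will first start a Deployment that runs a container using the hpa-example image, and expose it as a Service using the following manifest. The CPU request of 200m matters later: the HPA measures CPU utilization as a percentage of this request.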

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

To deploy, run the following command:

kubectl apply -f https://k8s.io/examples/application/php-apache.yaml

After applying the manifest, check the pod and service status with the following commands:

kubectl get pod  # check that the pod is running
NAME                          READY   STATUS    RESTARTS      AGE
php-apache-7495ff8f5b-xpcxx   1/1     Running   1 (97s ago)   23h

kubectl get svc  # check the services
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP   6d16h
php-apache   ClusterIP   10.110.24.116   <none>        80/TCP    23h

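The pod and service are ready, so we can now create the HorizontalPodAutoscaler with kubectl. The command below targets an average CPU utilization of 50% and keeps the Deployment between 1 and 3 replicas.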

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=3
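
The same autoscaler can also be described declaratively. As a minimal sketch, the Python snippet below builds an autoscaling/v2 HorizontalPodAutoscaler object that should be equivalent to the command above and prints it as JSON, which kubectl apply -f accepts just like YAML. The helper name build_hpa is my own and is not part of kubectl or Kubernetes.

```python
import json


def build_hpa(name: str, cpu_percent: int, min_replicas: int, max_replicas: int) -> dict:
    """Build an autoscaling/v2 HPA manifest for a Deployment with the given name.

    build_hpa is a hypothetical helper used only for illustration.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count the HPA controls.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Target average CPU utilization, as a percentage of the CPU request.
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_percent,
                        },
                    },
                }
            ],
        },
    }


hpa = build_hpa("php-apache", cpu_percent=50, min_replicas=1, max_replicas=3)
print(json.dumps(hpa, indent=2))
```

If you save the printed JSON to a file (for example hpa.json, a name chosen here only as an example), kubectl apply -f hpa.json should create the same kind of object as the kubectl autoscale command.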

To check the current status of the newly created HorizontalPodAutoscaler, run the following command:

kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         3         1          22h

When we check the HPA, the current CPU usage is 0% and the current replica count is one. Now we need to add some load to the pod. Run the following command in a separate terminal so the load keeps running while you watch the HPA.

kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

Within a minute or so, kubectl get hpa should report a higher CPU load; for example:

NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   171%/50%   1         3         3          23h

Once average CPU utilization rises to 171%, well above the 50% target, the HPA automatically scales the Deployment up to the maximum of 3 replicas. When we check the pods, we should see the following output:
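
To see why the HPA lands on 3 replicas, it helps to work through the scaling rule from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped to the min and max you configured. The snippet below is a simplified sketch of that rule only. The function name desired_replicas is my own, and the real controller also handles missing metrics, pods that are not ready, and stabilization. For illustration it assumes a single replica reporting the 171% shown above.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     min_replicas: int,
                     max_replicas: int,
                     tolerance: float = 0.1) -> int:
    """Simplified sketch of the HPA replica calculation (illustrative only)."""
    ratio = current_utilization / target_utilization
    # The controller skips scaling when the ratio is close to 1.0
    # (the default tolerance is 0.1).
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never go below --min or above --max.
    return max(min_replicas, min(max_replicas, desired))


# 1 replica at 171% against a 50% target: ceil(1 * 3.42) = 4, capped at max=3.
scaled_up = desired_replicas(1, 171, 50, min_replicas=1, max_replicas=3)
print(scaled_up)

# 3 replicas at 0% once the load stops: ceil(0) = 0, raised to min=1.
scaled_down = desired_replicas(3, 0, 50, min_replicas=1, max_replicas=3)
print(scaled_down)
```

Without the --max=3 cap, the same numbers would have asked for 4 replicas, which is why the maximum is worth choosing deliberately.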

NAME                          READY   STATUS    RESTARTS      AGE
php-apache-7495ff8f5b-dd7n2   1/1     Running   0             3m13s
php-apache-7495ff8f5b-wjp9r   1/1     Running   0             2m43s
php-apache-7495ff8f5b-xpcxx   1/1     Running   1 (16m ago)   23h

To stop the load test, press Ctrl+C in the terminal running the load generator. Once CPU utilization drops back to 0%, the HPA automatically scales the Deployment down to 1 replica. When we check the HPA and the pods, we should see the following output:

kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         3         1          23h

kubectl get pod
NAME                          READY   STATUS    RESTARTS      AGE
php-apache-7495ff8f5b-xpcxx   1/1     Running   1 (20m ago)   23h

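Autoscaling the replicas may take a few minutes. Scaling down in particular is delayed: by default the HPA waits for a five-minute stabilization window before removing pods, so the replica count does not drop the moment the load stops.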
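
Much of the wait on scale-down comes from the HPA's stabilization window, which defaults to 300 seconds. During that window the controller acts on the highest replica recommendation it has seen, so a brief dip in load does not remove pods. The snippet below is a rough sketch of that idea, not the controller's actual code. The function name and the sample timestamps are made up for illustration.

```python
from typing import List, Optional, Tuple


def stabilized_scale_down(recommendations: List[Tuple[int, int]],
                          now: int,
                          window_seconds: int = 300) -> Optional[int]:
    """Pick the highest recommendation made within the stabilization window.

    recommendations is a list of (timestamp_seconds, desired_replicas) pairs.
    This is a hypothetical helper, not a Kubernetes API.
    """
    recent = [replicas for ts, replicas in recommendations
              if now - ts <= window_seconds]
    return max(recent, default=None)


# Made-up history: load stops after t=60s, so later recommendations drop to 1.
history = [(0, 3), (60, 3), (120, 1), (180, 1)]

# At t=180s the recommendations for 3 replicas are still inside the window.
early = stabilized_scale_down(history, now=180)
print(early)

# At t=361s only the recommendations for 1 replica remain in the window.
late = stabilized_scale_down(history, now=361)
print(late)
```

This is why the replica count stays at 3 for a while after the load generator stops and only then falls back to 1.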

I have covered the one method that I know. If you know others, share them in the comments for everyone!

Thanks for reading, and follow me for more.

Have a great day!

Written by

TECH-NOTES

I'm a cloud-native enthusiast and tech blogger, sharing insights on Kubernetes, AWS, CI/CD, and Linux across my blog and Facebook page. Passionate about modern infrastructure and microservices, I aim to help others understand and leverage cloud-native technologies for scalable, efficient solutions.