Introduction:

The Horizontal Pod Autoscaler (HPA) is a key feature in Kubernetes that automatically adjusts the number of replicas in a Deployment, ReplicaSet, or ReplicationController based on the metrics defined when the HPA is created, such as CPU utilization. It helps us match the number of pods to the application's usage at any given time.

Why do we need the Horizontal Pod Autoscaler (HPA)?

Modern applications experience fluctuating workloads. For example, an e-commerce website might get a huge load during a sale. At that time the HPA increases the number of pods based on CPU usage, and when the sale ends and the load drops, it decreases the number of pods again.

Step-by-Step process to Enable HPA:

Step-1: (Enable the Metrics Server).

HPA relies on metrics collected by the Kubernetes Metrics Server, so first we need to install it on the Kubernetes cluster by applying its manifest.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
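
Before moving on, it can help to confirm that the Metrics Server is actually running, because the HPA cannot compute CPU utilization without it. The sketch below assumes the manifest above installed a Deployment named metrics-server into the kube-system namespace (the default for components.yaml); the function name check_metrics_server is only an illustrative helper.

```shell
# Hypothetical helper: verify that the Metrics Server is up and serving metrics.
check_metrics_server() {
  # Wait (up to 2 minutes) for the metrics-server Deployment to finish rolling out.
  kubectl rollout status deployment/metrics-server -n kube-system --timeout=120s || return 1
  # If the metrics API is working, this prints CPU and memory usage per node.
  kubectl top nodes
}
```

Run check_metrics_server after the install; if kubectl top nodes prints usage figures, the HPA should be able to read metrics as well.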

Step-2: (Create a Kubernetes Deployment file named deployment.yaml using the code below).

apiVersion: apps/v1
kind: Deployment 
metadata:
  name: hpa-deployment
  labels:
    app: microservice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: microservice
  template:
    metadata:
      labels:
        app: microservice
    spec:
      containers:
      - name: microservice
        # httpd is the official Apache HTTP Server image; "apache2" is not an official image name
        image: httpd
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: "100m"    # the HPA measures CPU utilization as a percentage of this request
            memory: "200Mi"
          limits:
            cpu: "200m"
            memory: "400Mi"

Step-3: (Apply the Deployment).

In this step we will apply the deployment using the command below.

kubectl apply -f deployment.yaml

Step-4: (Expose the deployment).

In this step we will create a service file named service.yaml to expose our deployment.

apiVersion: v1
kind: Service
metadata:
  name: hpa-service
spec:
  type: NodePort
  selector:
    app: microservice
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 30007

Step-5: (Apply the service using the command below).

kubectl apply -f service.yaml

Step-6: (Create the HPA using the command below).

kubectl autoscale deployment hpa-deployment --cpu-percent=20 --min=1 --max=10

Here we have set the target CPU utilization to 20%, measured as the average across all pods relative to the CPU each pod requests (100m in our deployment). Whenever the average utilization exceeds this target, the HPA starts creating pods. We start with a minimum of 1 replica, and depending on CPU usage the HPA can scale up to a maximum of 10 pods.
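
The same autoscaler can also be written declaratively, which is handy if you keep your manifests in version control. The snippet below is a minimal sketch that writes an autoscaling/v2 manifest equivalent to the command above; the file name hpa.yaml is an assumption, and the HPA name hpa-deployment simply mirrors the Deployment created in Step-2.

```shell
# Write a HorizontalPodAutoscaler manifest to hpa.yaml (file name is an assumption).
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-deployment
spec:
  # The workload whose replica count the HPA controls
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # Target an average CPU utilization of 20% of each pod's CPU request
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 20
EOF
```

Apply it with kubectl apply -f hpa.yaml instead of running kubectl autoscale; use one approach or the other, not both.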

Step-7: (Monitor the HPA).

Now we can watch new pods being created in real time based on CPU metrics. First, start watching the cluster with the command below; then we will increase the load on our running container.

watch kubectl get all

The above command shows the status of all Kubernetes objects in real time and refreshes every 2 seconds, so keep it open in one terminal tab; whenever the load increases, it will show the number of pods growing.
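
To focus on the autoscaler itself rather than every object, you can also watch the HPA resource directly. This is a small sketch: watch_hpa is a hypothetical helper, and it assumes the HPA is named hpa-deployment, since kubectl autoscale names the HPA after its target unless told otherwise.

```shell
# Hypothetical helper: stream changes to the HPA as they happen.
watch_hpa() {
  # Defaults to hpa-deployment (assumed HPA name); pass another name as $1 if yours differs.
  kubectl get hpa "${1:-hpa-deployment}" --watch
}
```

The TARGETS column shows current versus target CPU utilization, and the REPLICAS column shows how many pods the HPA is currently running.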

kubectl exec -it deployment/hpa-deployment -- /bin/bash

The above command opens a bash shell inside a running container of the deployment. There we run the command below again and again to increase the load while monitoring the status of all Kubernetes objects in the other tab.

apt-get update

After running the above command several times, the CPU load rises above 20% and you will see the number of pods increase. When we stop applying load to the container, CPU usage drops, and after a few minutes (the default scale-down stabilization window is 5 minutes) the HPA starts terminating the extra pods.
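
To get a feel for how many pods to expect, the HPA's core rule is desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization). The function below is a simplified sketch of that arithmetic only: desired_replicas is an illustrative name, and it ignores the controller's tolerance band and the min/max replica limits.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTILIZATION TARGET_UTILIZATION
# Computes ceil(current_replicas * current_utilization / target_utilization).
desired_replicas() {
  current_replicas=$1
  current_util=$2
  target_util=$3
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (current_replicas * current_util + target_util - 1) / target_util ))
}

desired_replicas 1 50 20   # 1 pod at 50% of its CPU request, target 20%: prints 3
desired_replicas 3 10 20   # 3 pods at 10%, target 20%: prints 2
```

So a single pod running at 50% of its CPU request would be scaled to 3 pods, and once the load falls the same rule brings the count back down.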

Automating Scaling with Kubernetes Horizontal Pod Autoscaler (HPA)

Gaurav Kumar