Deep dive into Kubernetes Cluster Maintenance

Subho Dey
10 min read

Kubernetes clusters are complex, mission-critical systems, so it's important to have a solid plan in place for upgrades, backup and restore, and scaling.

Upgrading Kubeadm Clusters:

Here is a bash script to upgrade a Kubernetes cluster using Kubeadm:

#!/bin/bash
# Upgrade a kubeadm-managed cluster. Run this on the control-plane node.
# Replace <new-version> with the target package version (e.g. 1.27.x-00).

# Check the current version of Kubernetes
echo "Current Kubernetes version:"
kubectl version

# Check the available versions of kubeadm
echo "Available versions of kubeadm:"
apt list -a kubeadm

# Upgrade kubeadm itself first
echo "Upgrading kubeadm..."
apt-get update && apt-get install -y kubeadm=<new-version>

# Review, then apply the control plane upgrade
echo "Upgrading the control plane components..."
kubeadm upgrade plan
kubeadm upgrade apply <new-version>

# Upgrade the nodes one at a time: drain, upgrade, restart kubelet, uncordon.
# Draining every node at once would evict all workloads simultaneously.
for node in $(kubectl get nodes --no-headers | awk '{print $1}'); do
  echo "Draining $node..."
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data

  echo "Upgrading kubelet and kubectl on $node..."
  ssh "$node" "apt-get update && \
    apt-get install -y kubeadm=<new-version> kubelet=<new-version> kubectl=<new-version> && \
    kubeadm upgrade node && \
    systemctl daemon-reload && systemctl restart kubelet"

  echo "Uncordoning $node..."
  kubectl uncordon "$node"
done

# Verify that the upgrade was successful
echo "Verifying the upgrade..."
kubectl get nodes
kubectl version

Note that you will need to replace <new-version> with the version you want to upgrade to (pick it from the apt list output). You may also need to adjust the SSH command to match how you access the worker nodes in your cluster.
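On Debian/Ubuntu hosts, the kubeadm, kubelet, and kubectl packages are commonly pinned so that routine apt upgrades don't move them unexpectedly. A minimal sketch of that pattern, using the same <new-version> placeholder as above:

# Release the hold, install the pinned target version, then re-hold
apt-mark unhold kubeadm kubelet kubectl
apt-get update && apt-get install -y kubeadm=<new-version> kubelet=<new-version> kubectl=<new-version>
apt-mark hold kubeadm kubelet kubectl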

Backup and Restore a Kubernetes Cluster Using TrilioVault for Kubernetes:

TrilioVault for Kubernetes is a data protection and disaster recovery solution designed specifically for Kubernetes environments. It provides application-centric backup and restore capabilities, as well as the ability to migrate workloads across clusters.

Prerequisites

Before we get started, there are a few prerequisites you'll need to have in place:

  • A Kubernetes cluster running version 1.17 or later

  • Helm version 3 installed

  • Access to a TrilioVault for Kubernetes installation

  • A storage location where backups can be stored
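You can quickly confirm the first two prerequisites from your terminal before proceeding:

kubectl version          # the server version should be v1.17 or later
helm version             # should report v3.x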

Installing TrilioVault for Kubernetes

First, we need to install TrilioVault for Kubernetes. To do this, we'll use Helm.

  1. Add the Trilio repository to Helm:

helm repo add trilio https://charts.trilio.io/

  2. Update your local Helm chart repository:

helm repo update

  3. Install TrilioVault for Kubernetes:

helm install triliovault trilio/triliovault \
  --namespace tvk \
  --create-namespace \
  --set credentials.username=<username> \
  --set credentials.password=<password> \
  --set global.deployment.envName=triliovault \
  --set backup.target=<backup-storage-target>

Replace <username> and <password> with the credentials for your TrilioVault installation, and <backup-storage-target> with the storage location where backups should be stored. The release is installed into the tvk namespace, which the rest of this guide assumes.
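Once the release is deployed, it's worth confirming that the TVK pods came up in the tvk namespace before moving on:

kubectl get pods -n tvk
helm ls -n tvk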

Accessing the TVK Management Console:

kubectl get svc -n tvk
kubectl port-forward svc/k8s-triliovault-ingress-nginx-controller 8080:80 -n tvk &

With the port-forward running, the console is available in your browser at http://127.0.0.1:8080.

Creating a Backup

Now that TrilioVault for Kubernetes is installed, we can create a backup of our cluster. To do this, we'll use the tvctl command-line interface provided by TrilioVault.

  1. Install the tvctl command-line interface:
curl -s https://raw.githubusercontent.com/trilioData/tvctl/main/install.sh | bash
  2. Log in to TrilioVault:
tvctl login --user <username> --password <password> --url <triliovault-url>

Replace <username>, <password>, and <triliovault-url> with the appropriate values for your TrilioVault installation.

  3. Create a backup of the cluster:
tvctl backup create --cluster <cluster-name> --target <backup-storage-target>

Replace <cluster-name> with the name of your Kubernetes cluster, and <backup-storage-target> with the storage location where backups should be stored.

Restoring from a Backup

In the event of a disaster or data loss, we can use TrilioVault for Kubernetes to restore our cluster from a backup.

  1. Log in to TrilioVault:
tvctl login --user <username> --password <password> --url <triliovault-url>
  2. List available backups:
tvctl backup list --cluster <cluster-name>

Replace <cluster-name> with the name of your Kubernetes cluster.

  3. Restore the cluster from a backup:
tvctl backup restore --cluster <cluster-name> --backup <backup-name>

Replace <cluster-name> with the name of your Kubernetes cluster, and <backup-name> with the name of the backup you want to restore from.

Creating a TrilioVault Target to Store Backups

First, create a Secret that holds the access credentials for your S3-compatible bucket:

nano trilio-s3-target-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: trilio-s3-target-secret
  namespace: tvk
type: Opaque
stringData:
  accessKey: your_bucket_access_key
  secretKey: your_bucket_secret_key

kubectl apply -f trilio-s3-target-secret.yaml -n tvk

Next, define the Target that points TVK at the bucket, referencing the Secret you just created:

nano trilio-s3-target.yaml

apiVersion: triliovault.trilio.io/v1
kind: Target
metadata:
  name: trilio-s3-target
  namespace: tvk
spec:
  type: ObjectStore
  vendor: Other
  enableBrowsing: true
  objectStoreCredentials:
    bucketName: your_bucket_name
    region: your_bucket_region           # e.g.: nyc1 or us-east-1
    url: https://nyc1.digitaloceanspaces.com      # update the region to match your bucket
    credentialSecret:
      name: trilio-s3-target-secret
      namespace: tvk
  thresholdCapacity: 10Gi

kubectl apply -f trilio-s3-target.yaml -n tvk
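TVK validates the Target against the bucket when it is created; before taking any backups, confirm that it reports an Available status:

kubectl get target trilio-s3-target -n tvk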

Creating the Kubernetes Cluster Backup

First, create a ClusterBackupPlan that ties the namespaces you want to protect to the Target:

nano k8s-cluster-backup-plan.yaml

apiVersion: triliovault.trilio.io/v1
kind: ClusterBackupPlan
metadata:
  name: k8s-cluster-backup-plan
  namespace: tvk
spec:
  backupConfig:
    target:
      name: trilio-s3-target
      namespace: tvk
  backupComponents:
    - namespace: wordpress
    - namespace: mysqldb
    - namespace: etcd

kubectl apply -f k8s-cluster-backup-plan.yaml

Output:
clusterbackupplan.triliovault.trilio.io/k8s-cluster-backup-plan created

kubectl get clusterbackupplan k8s-cluster-backup-plan -n tvk

The output looks similar to this:

Output:
NAME                              TARGET             ...   STATUS
k8s-cluster-backup-plan           trilio-s3-target   ...   Available

[Screencapture showing the status of the Cluster Backup Plan]

Next, create a ClusterBackup that executes the plan:

nano k8s-cluster-backup.yaml

apiVersion: triliovault.trilio.io/v1
kind: ClusterBackup
metadata:
  name: k8s-cluster-backup
  namespace: tvk
spec:
  type: Full
  clusterBackupPlan:
    name: k8s-cluster-backup-plan
    namespace: tvk

kubectl apply -f k8s-cluster-backup.yaml

Output:
clusterbackup.triliovault.trilio.io/k8s-cluster-backup created

kubectl get clusterbackup k8s-cluster-backup -n tvk

Output:
NAME                 BACKUPPLAN                BACKUP TYPE   STATUS      ...   PERCENTAGE COMPLETE
k8s-cluster-backup   k8s-cluster-backup-plan   Full          Available   ...   100
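If the backup stalls before reaching 100 percent, describing the resource surfaces the underlying events:

kubectl describe clusterbackup k8s-cluster-backup -n tvk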

[Screencapture showing the status of the Cluster Backup]

kubectl delete ns wordpress
kubectl delete ns mysqldb
kubectl delete ns etcd
Output:
namespace "wordpress" deleted
namespace "mysqldb" deleted
namespace "etcd" deleted

Now that your namespaces are deleted, you’ll restore the backup.

Restoring the Backup with the Management Console:

In this section, you will use the TVK web console to restore the important applications from your backup. The restore process first validates the target where the backup is stored; TVK then connects to the target repository, pulls the backup files using its Datamover and Metamover pods, and recreates the Kubernetes applications from the backup storage.

To get started with the restore operation, select the target you created earlier in the console, browse to the cluster backup, and launch the restore from there.
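If you prefer to drive the restore from the CLI instead of the console, TVK also models restores as custom resources. The sketch below is an assumption based on the ClusterBackup resource used earlier, not something this tutorial verified; check the TVK CRD reference for the exact field names in your version:

apiVersion: triliovault.trilio.io/v1
kind: ClusterRestore
metadata:
  name: k8s-cluster-restore
  namespace: tvk
spec:
  source:
    type: ClusterBackup          # assumed field; see the TVK CRD reference
    clusterBackup:
      name: k8s-cluster-backup   # the backup created above
      namespace: tvk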

Checking the Cluster Applications State:

In this section, you will make sure that the restore operation was successful and that the applications are accessible after the restore. To begin, run the following commands to retrieve all of the objects related to the application from the namespaces listed:

kubectl get all --namespace wordpress
kubectl get all --namespace mysqldb
kubectl get all --namespace etcd

Your output will look similar to the following for each application:

Output:
NAME                             READY   STATUS    RESTARTS   AGE
pod/wordpress-5dcf55f8fc-72h9q   1/1     Running   1          2m21s
pod/wordpress-mariadb-0          1/1     Running   1          2m20s

NAME                        TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                      AGE
service/wordpress           LoadBalancer   10.120.1.38    34.71.102.21   80:32402/TCP,443:31522/TCP   2m21s
service/wordpress-mariadb   ClusterIP      10.120.7.213   <none>         3306/TCP                     2m21s

NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/wordpress   1/1     1            1           2m21s

NAME                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/wordpress-5dcf55f8fc   1         1         1       2m21s

NAME                                 READY   AGE
statefulset.apps/wordpress-mariadb   1/1     2m21s
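To confirm WordPress is actually serving traffic after the restore, you can probe the LoadBalancer's external IP from the output above (your address will differ):

curl -I http://34.71.102.21    # expect an HTTP 200 or a redirect to the setup page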

Scheduling Backups:

Creating backups automatically based on a schedule is a very useful feature to have. It allows you to rewind time and restore the system to a previous working state if something goes wrong. By default, TrilioVault creates three scheduled policies: daily, weekly, and monthly.

In the TVK console, you can view the default policies under Backup & Recovery, then Scheduling Policies:

[Screencapture showing the default scheduled policies in the TVK management console]

scheduled-backup-every-5min.yaml

apiVersion: triliovault.trilio.io/v1
kind: Policy
metadata:
  name: scheduled-backup-every-5min
  namespace: tvk
spec:
  type: Schedule
  scheduleConfig:
    schedule:
      - "*/5 * * * *" # trigger every 5 minutes

kubectl apply -f scheduled-backup-every-5min.yaml

Your output will look like this:

Output:
policy.triliovault.trilio.io/scheduled-backup-every-5min created

Next, reference the policy from the backup plan's spec.backupConfig.schedulePolicy field. Here is the updated k8s-cluster-backup-plan.yaml:

apiVersion: triliovault.trilio.io/v1
kind: ClusterBackupPlan
metadata:
  name: k8s-cluster-backup-plan
  namespace: tvk
spec:
  backupConfig:
    target:
      name: trilio-s3-target
      namespace: tvk
    schedulePolicy:
      fullBackupPolicy:
        name: scheduled-backup-every-5min
        namespace: tvk
  backupComponents:
    - namespace: wordpress
    - namespace: mysqldb
    - namespace: etcd
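Re-apply the plan so the schedule takes effect, then watch new ClusterBackup resources appear every five minutes:

kubectl apply -f k8s-cluster-backup-plan.yaml
kubectl get clusterbackup -n tvk --watch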

TVK also has a default retention policy, which you can view in the TVK console under Backup & Recovery, then Retention Policies:

[Screencapture showing the default retention policy in the TVK management console]

sample-retention-policy.yaml

apiVersion: triliovault.trilio.io/v1
kind: Policy
metadata:
  name: sample-retention-policy
spec:
  type: Retention
  retentionConfig:
    latest: 2
    weekly: 1
    dayOfWeek: Wednesday
    monthly: 1
    dateOfMonth: 15
    monthOfYear: March
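The policy above does not set a namespace, so apply it into tvk explicitly so the backup plan below can reference it:

kubectl apply -f sample-retention-policy.yaml -n tvk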

Then reference it from the plan's spec.backupConfig.retentionPolicy field in k8s-cluster-backup-plan.yaml:

apiVersion: triliovault.trilio.io/v1
kind: ClusterBackupPlan
metadata:
  name: k8s-cluster-backup-plan
  namespace: tvk
spec:
  backupConfig:
    target:
      name: trilio-s3-target
      namespace: tvk
    retentionPolicy:
      fullBackupPolicy:
        name: sample-retention-policy
        namespace: tvk
  backupComponents:
    - namespace: wordpress
    - namespace: mysqldb
    - namespace: etcd


Autoscaling in Kubernetes:

Scaling a Kubernetes cluster involves adding or removing nodes to or from the cluster to increase or decrease its capacity. Here are the general steps to scale a Kubernetes cluster:

  1. Add new worker nodes to the cluster. You can add worker nodes using a cloud provider's console or API, or by provisioning new nodes using a tool like kubeadm.

  2. Join the new worker nodes to the Kubernetes cluster. You can join the nodes to the cluster by running the kubeadm join command on each new node with the appropriate token and flags.

  3. Verify that the new worker nodes are added to the cluster and functioning correctly. You can use the kubectl get nodes command to verify the nodes are added to the cluster and the kubectl describe node command to check their status.

  4. If desired, you can adjust the number of replicas for a deployment, stateful set, or replication controller to take advantage of the new capacity provided by the additional worker nodes. This can be done using the kubectl scale command (see the sketch after this list).

  5. If necessary, you can also scale the control plane components, such as the API server, etcd, scheduler, and controller manager, to handle the increased workload. This can be done by adding more replicas or upgrading the resources assigned to each component.

  6. Monitor the cluster to ensure that everything is functioning correctly and that the new nodes are handling their share of the workload.

  7. If desired, you can remove nodes from the cluster to decrease its capacity. This involves draining the node of any running Pods, deleting the node from the cluster using kubectl, and optionally deleting the node from the cloud provider.
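As a concrete sketch of steps 3 and 4, assuming a new node named worker-3 and a deployment named my-app (both names are illustrative):

kubectl get nodes                                # confirm the new node has joined and is Ready
kubectl describe node worker-3                   # inspect its conditions and capacity
kubectl scale deployment my-app --replicas=5     # spread work onto the new capacity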

Setting Up Autoscaling on GCE

First, we set up a cluster with Cluster Autoscaler turned on. The number of nodes in the cluster will start at 2, and autoscale up to a maximum of 5. To implement this, we’ll export the following environment variables:

export NUM_NODES=2
export KUBE_AUTOSCALER_MIN_NODES=2
export KUBE_AUTOSCALER_MAX_NODES=5
export KUBE_ENABLE_CLUSTER_AUTOSCALER=true

Then start the cluster by running:

./cluster/kube-up.sh

Let's look at our cluster; it should have two nodes:

kubectl get nodes

Run & Expose PHP-Apache Server

kubectl run php-apache \
kubectl get deployment
kubectl run -i --tty service-test --image=busybox /bin/sh  
Hit enter for command prompt  
$ wget -q -O- http://php-apache.default.svc.cluster.local
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
 kubectl get hpa
 kubectl get hpa
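To watch the autoscaler react, generate sustained load from a second pod, following the pattern from the standard HPA walkthrough:

kubectl run -i --tty load-generator --image=busybox /bin/sh

Inside the container, hammer the service in a loop:

while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

In another terminal, watch the replica count climb:

kubectl get hpa -w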

Conclusion:

Backing up and restoring a Kubernetes cluster can be a complex task, but with TrilioVault for Kubernetes, it becomes a lot easier. By following the steps outlined in this blog post, you can ensure that your Kubernetes cluster is protected from disasters and data loss.


Written by Subho Dey

"DevOps engineer with a passion for continuous improvement and a drive to build better software, faster. I'm a strong believer in the power of collaboration, automation, and agile methodologies to transform the world of software development and delivery. My expertise includes continuous integration and delivery, infrastructure as code, Docker, Kubernetes, and configuration management. Follow along as I share my insights and experiences on Hashnode and let's build better software, together!"