Implementing Chaos Engineering on Redis with LitmusChaos: Simulating Leader Pod Failures


Introduction
Stateful applications like Redis need to stay available when individual pods fail. Chaos engineering tests this proactively by injecting failures on purpose and observing how the system recovers. This tutorial uses LitmusChaos to delete a Redis leader pod on Kubernetes and check whether the cluster fails over.
Prerequisites
Before diving into the chaos experiments, ensure you have the following:
Kubernetes Cluster: A running Kubernetes cluster (v1.20 or later).
kubectl: Command-line tool configured to interact with your cluster.
Helm: Package manager for Kubernetes applications.
LitmusChaos: Installed in your cluster. If it is not installed yet, Step 2 covers the installation.
Redis Cluster: Deployed in your Kubernetes environment.
Understanding the Kubernetes Architecture
Before we start, it helps to understand how Kubernetes manages this workload. Redis on Kubernetes typically runs as a StatefulSet, which gives each pod a stable name, its own persistent volume, and ordered startup and shutdown.
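One practical consequence of the StatefulSet model is that every pod keeps a predictable name and DNS entry, built from the StatefulSet name and an ordinal index. The shell sketch below illustrates that naming rule. The function name pod_dns_name is hypothetical, and the headless Service name redis-cluster-headless used in the example is an assumption about what the chart creates, so confirm it with kubectl get svc.

```shell
#!/usr/bin/env bash
# Build the stable DNS name Kubernetes assigns to a StatefulSet pod:
#   <statefulset>-<ordinal>.<headless-service>.<namespace>.svc.cluster.local
# cluster.local is the default cluster domain; your cluster may use another.
pod_dns_name() {
  local sts="$1" ordinal="$2" svc="$3" ns="${4:-default}"
  printf '%s-%s.%s.%s.svc.cluster.local\n' "$sts" "$ordinal" "$svc" "$ns"
}

# Example: the first Redis pod in the default namespace.
# (redis-cluster-headless is an assumed Service name.)
pod_dns_name redis-cluster 0 redis-cluster-headless default
```

Because the name survives a pod deletion, a replacement pod comes back as redis-cluster-0 again, which is why the chaos experiment later in this tutorial can target a pod by name.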
Step 1: Deploying a Redis Cluster
For this tutorial, we'll deploy a Redis cluster using the Bitnami Helm chart.
1. Add the Bitnami Repository
helm repo add bitnami https://charts.bitnami.com/bitnami
2. Install Redis Cluster
helm install redis-cluster bitnami/redis-cluster
This command deploys a Redis cluster with default configurations.
3. Verify the Redis Cluster Deployment
kubectl get pods
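Expected output (abbreviated; with default values the chart should start six pods, three masters with one replica each):
NAME              READY   STATUS    RESTARTS   AGE
redis-cluster-0   1/1     Running   0          2m
redis-cluster-1   1/1     Running   0          2m
redis-cluster-2   1/1     Running   0          2m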
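If you want to check readiness in a script rather than by eye, the listing can be filtered. The following is a minimal sketch, not part of the chart or of kubectl: the helper name not_ready_pods is hypothetical, and it assumes the default column layout of kubectl get pods shown above.

```shell
#!/usr/bin/env bash
# Read `kubectl get pods` output on stdin and print the name of every pod
# that is not Running or whose READY column (for example 0/1) shows that
# some containers are not ready. The header row is skipped.
not_ready_pods() {
  awk 'NR > 1 {
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) print $1
       }'
}

# Usage against a live cluster (prints nothing when all pods are ready):
#   kubectl get pods -l app.kubernetes.io/name=redis-cluster | not_ready_pods
```

An empty result before the experiment gives you a clean baseline, so anything the helper prints afterwards can be attributed to the injected failure.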
Step 2: Installing LitmusChaos
If LitmusChaos isn't already installed in your cluster, proceed with the following steps:
1. Add Litmus Helm Repository
helm repo add litmuschaos https://litmuschaos.github.io/litmus-helm/
2. Install LitmusChaos
helm install litmus litmuschaos/litmus --namespace litmus --create-namespace
This installs the LitmusChaos components in the litmus namespace.
Step 3: Understanding Chaos Experiments
LitmusChaos offers various predefined experiments. For simulating a leader pod failure in Redis, we'll use the pod-delete experiment.
The experiment deletes a Redis leader pod (a master node, in Redis Cluster terms) so that we can observe how the cluster recovers.
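Which pod currently holds a leader role is not fixed, so it is worth checking before choosing a target. One way, sketched below, is to parse the output of the Redis CLUSTER NODES command, whose third column carries flags such as master, slave, and fail. The helper name master_ips is hypothetical, and the usage lines assume the password is available in a REDIS_PASSWORD variable.

```shell
#!/usr/bin/env bash
# Read `redis-cli cluster nodes` output on stdin and print the IP address
# of every node flagged as master that is not marked as failed.
# Line layout: <id> <ip:port@cport> <flags> <master-id> ... <link-state> <slots>
master_ips() {
  awk '{
         n = split($3, flags, ",")
         is_master = 0; failed = 0
         for (i = 1; i <= n; i++) {
           if (flags[i] == "master") is_master = 1
           if (flags[i] == "fail" || flags[i] == "fail?") failed = 1
         }
         if (is_master && !failed) { split($2, addr, ":"); print addr[1] }
       }'
}

# Usage against a live cluster (REDIS_PASSWORD is an assumption):
#   kubectl exec redis-cluster-0 -- redis-cli -a "$REDIS_PASSWORD" cluster nodes | master_ips
#   kubectl get pods -o wide    # match the printed IPs to pod names
```

Running the same check after the experiment shows whether a replica was promoted in place of the deleted pod.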
Step 4: Configuring the Chaos Experiment
1. Create a Chaos Namespace
kubectl create namespace redis-chaos
2. Define the Chaos Experiment
Create a file named redis-pod-delete-experiment.yaml with the following content:
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosExperiment
metadata:
  name: pod-delete
  namespace: redis-chaos
spec:
  definition:
    scope: Namespaced
    permissions:
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["delete"]
    image: "litmuschaos/go-runner:latest"
    args:
      - -c
      - ./experiments/generic/pod_delete/pod_delete.test
    command:
      - /bin/bash
3. Apply the Chaos Experiment
kubectl apply -f redis-pod-delete-experiment.yaml
Step 5: Configuring the Chaos Engine
The ChaosEngine links the application under test (Redis) with the chaos experiment.
1. Create a ChaosEngine Definition
Create a file named redis-chaos-engine.yaml with the following content:
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: redis-chaos
  namespace: redis-chaos
spec:
  # The engine only runs experiments while its state is "active".
  engineState: active
  appinfo:
    appns: default
    applabel: "app.kubernetes.io/name=redis-cluster"
    appkind: statefulset
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-delete
      spec:
        components:
          env:
            # Comma-separated list of pod names to delete.
            - name: TARGET_PODS
              value: "redis-cluster-0"
2. Apply the ChaosEngine
kubectl apply -f redis-chaos-engine.yaml
Step 6: Running the Chaos Experiment
Once everything is set up, we can now initiate the chaos experiment.
- Start the Experiment
kubectl annotate statefulset redis-cluster litmuschaos.io/chaos="true"
kubectl apply -f redis-chaos-engine.yaml
- Observe the Experiment Execution
kubectl get pods -n redis-chaos
This command lists the pods that LitmusChaos creates to run the experiment, along with their status.
- Monitor Logs
To monitor experiment execution in real-time:
kubectl logs -f <chaos-pod-name> -n redis-chaos
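Besides the logs, Litmus records the outcome of a run in a ChaosResult resource, which by convention is named after the engine and the experiment (here that would be redis-chaos-pod-delete). The sketch below turns its verdict field into an exit code so a script or CI job can react to it. The helper name verdict_exit_code and the exit-code mapping are my own choices, not part of Litmus.

```shell
#!/usr/bin/env bash
# Map a ChaosResult verdict string to an exit code:
#   0 = Pass, 1 = Fail or Stopped, 2 = still running (Awaited) or unknown.
verdict_exit_code() {
  case "$1" in
    Pass) return 0 ;;
    Fail|Stopped) return 1 ;;
    *) return 2 ;;
  esac
}

# Usage against a live cluster (resource name assumes the engine and
# experiment names used in this tutorial):
#   verdict=$(kubectl get chaosresult redis-chaos-pod-delete -n redis-chaos \
#     -o jsonpath='{.status.experimentStatus.verdict}')
#   verdict_exit_code "$verdict"
```

While the experiment is still in progress the verdict is typically Awaited, so a script may need to poll until it changes.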
Step 7: Analyzing the Results
Once the experiment completes, observe the impact on the Redis cluster:
Was the leader pod successfully deleted?
Did the cluster automatically elect a new leader?
Were there any service disruptions?
Use the kubectl describe pods and kubectl logs commands to analyze the system's behavior.
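To answer the recovery questions above from Redis's own point of view, you can also ask a surviving node for the cluster state. The sketch below extracts the cluster_state field from the output of the Redis CLUSTER INFO command. The helper name cluster_state is hypothetical, and the usage lines assume a REDIS_PASSWORD variable and that redis-cluster-1 was not the deleted pod.

```shell
#!/usr/bin/env bash
# Read `redis-cli cluster info` output on stdin and print the value of
# cluster_state ("ok" or "fail"). The lines may end in CRLF, so carriage
# returns are stripped before matching.
cluster_state() {
  tr -d '\r' | awk -F: '$1 == "cluster_state" { print $2 }'
}

# Usage against a live cluster: poll once per second until the cluster
# reports ok again, which gives a rough measure of recovery time.
#   until [ "$(kubectl exec redis-cluster-1 -- redis-cli -a "$REDIS_PASSWORD" \
#            cluster info | cluster_state)" = "ok" ]; do sleep 1; done
```

A state of ok means every hash slot is served by a reachable node, which is a stronger signal than the pod simply showing Running.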
Written by Saurav Anand
