How to Set Up Disaster Recovery (DR) for AWS MSK with MirrorMaker 2 – Step-by-Step Guide

DevOpsofworld

In today's cloud-native world, ensuring high availability and resilience for streaming platforms like Apache Kafka is mission-critical. Amazon MSK (Managed Streaming for Apache Kafka) offers a powerful, fully managed Kafka service. However, it doesn't natively provide cross-region disaster recovery (DR). In this guide, you'll learn how to configure cross-region DR for AWS MSK using Apache Kafka MirrorMaker 2 (MM2), a robust, open-source replication tool.

This comprehensive walkthrough includes prerequisites, cluster setup, networking, and end-to-end validation to help you build a production-ready DR solution.


๐Ÿ” What is AWS MSK?

AWS MSK (Managed Streaming for Apache Kafka) is a fully managed service that simplifies running Apache Kafka on AWS. It eliminates the operational overhead of provisioning servers, configuring clusters, and managing availability.

Key features of AWS MSK:

  • Fully managed Apache Kafka

  • Secure by default with VPC, TLS, and IAM integration

  • Scalable with automatic broker scaling and storage expansion

  • Native support for monitoring via CloudWatch and logging integrations


What is MirrorMaker 2?

MirrorMaker 2 (MM2) is the enhanced replication utility introduced in Apache Kafka 2.4+. It's designed for copying data between Kafka clusters and is built on Kafka Connect, providing modularity, scalability, and fault tolerance.

Key capabilities:

  • Real-time replication of topics and consumer offsets

  • Support for multiple clusters

  • Active-passive and active-active configurations

  • Flexible replication policies and error handling


Available Methods for Kafka DR – Why MirrorMaker 2?

Several options exist for disaster recovery in Kafka:

Method                     | Real-Time | Offset Sync | Cost    | Complexity | Description
MirrorMaker 2 (MM2)        | Yes       | Yes         | Low–Med | Medium     | Open-source Kafka-native tool ideal for AWS MSK with IAM support.
Confluent Replicator       | Yes       | Yes         | High    | High       | Commercial-grade tool with advanced features.
Custom Producers/Consumers | Yes       | No          | Medium  | High       | Build-your-own with full control.
Kafka Streams or Flink     | Yes       | No          | High    | High       | Stream processing with built-in replication logic.
S3 Backup & Restore        | No        | No          | Low     | Low        | Periodic export-import, cold DR only.

Why we chose MirrorMaker 2 for this guide:

  • Seamless integration with AWS MSK and IAM authentication

  • No additional licensing or external dependencies

  • Good balance of simplicity, performance, and reliability


Prerequisites

To follow this tutorial, ensure you have:

  • An AWS account with permissions for MSK, EC2, IAM, and VPC

  • AWS CLI installed and configured

  • Java 11+ installed on the EC2 instance

  • Kafka client tools (Apache Kafka binaries)

  • Two VPCs in different regions (e.g., ap-south-1 and us-east-1)

  • AWS MSK IAM Authentication JAR: aws-msk-iam-auth.jar
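Before going further, it's worth a quick sanity check that the AWS CLI is configured and can reach both regions. A sketch (the cluster lists will be empty until Step 1 is done):

```shell
# Confirm credentials are valid; prints your AWS account ID.
aws sts get-caller-identity --query Account --output text

# Confirm the CLI can reach the MSK API in both regions used in this guide.
aws kafka list-clusters --region ap-south-1 --query 'ClusterInfoList[].ClusterName' --output text
aws kafka list-clusters --region us-east-1 --query 'ClusterInfoList[].ClusterName' --output text
```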


๐Ÿ—๏ธ Step 1: Create Primary and DR MSK Clusters

Create Primary Cluster (ap-south-1)

  1. Go to Amazon MSK > Create Cluster

  2. Select Custom create

  3. Cluster name: msk-primary

  4. Kafka version: 3.0+

  5. Brokers: kafka.m5.large, 3 brokers, 1000 GiB EBS each

  6. Network: Select VPC and 3 subnets

  7. Enable:

    • Encryption at rest (KMS)

    • TLS in-transit encryption

    • IAM authentication

  8. Assign a security group that allows inbound TCP 9098 from the EC2 instance (9098 is the IAM-auth port for in-VPC clients; 9198 applies only when public access is enabled)

Create DR Cluster (us-east-1)

Repeat the above steps in the us-east-1 region, using the cluster name msk-dr. Ensure consistent configuration across clusters.
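Once each cluster is active, you can fetch its bootstrap broker string from the CLI instead of the console. A sketch with a hypothetical cluster ARN (substitute your own; repeat with the us-east-1 ARN for the DR cluster):

```shell
# Fetch the SASL/IAM bootstrap brokers for the primary cluster.
# The result is a comma-separated host:port list -- use it wherever
# <primary-broker-list> appears in the commands below.
aws kafka get-bootstrap-brokers \
  --region ap-south-1 \
  --cluster-arn arn:aws:kafka:ap-south-1:111122223333:cluster/msk-primary/abc123 \
  --query BootstrapBrokerStringSaslIam \
  --output text
```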


๐Ÿ” Step 2: Configure Network and Security

Update security group rules:

  • MSK SG: Allow inbound TCP 9098 from the EC2 SG or IP

  • EC2 SG: Allow outbound TCP 9098 to both MSK clusters

  • Enable SSH (port 22) on EC2 for management access
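These rules can also be applied from the CLI. A sketch with hypothetical security group IDs (repeat in us-east-1 for the DR cluster's SG):

```shell
# Allow the EC2 client SG to reach the MSK brokers on the IAM-auth port.
# sg-0aaaaaaaaaaaaaaaa = MSK cluster SG, sg-0bbbbbbbbbbbbbbbb = EC2 client SG
# (both IDs are placeholders -- substitute your own).
aws ec2 authorize-security-group-ingress \
  --region ap-south-1 \
  --group-id sg-0aaaaaaaaaaaaaaaa \
  --protocol tcp \
  --port 9098 \
  --source-group sg-0bbbbbbbbbbbbbbbb
```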


Step 3: Launch EC2 with Kafka Client

Launch EC2

  • Region: ap-south-1

  • Type: t3.medium or higher

  • Attach IAM role for MSK and Secrets Manager (if needed)

  • Ensure internet access (NAT or public IP)

๐Ÿ› ๏ธ Install Tools and Configure

sudo yum update -y
sudo yum install -y java-11-amazon-corretto

# If this Kafka version is no longer on downloads.apache.org, fetch it
# from archive.apache.org/dist/kafka/ instead.
wget https://downloads.apache.org/kafka/3.3.1/kafka_2.13-3.3.1.tgz
tar -xzf kafka_2.13-3.3.1.tgz
export KAFKA_HOME=$(pwd)/kafka_2.13-3.3.1

# Release assets on GitHub are versioned (e.g. aws-msk-iam-auth-<version>-all.jar);
# check the releases page for the exact filename if this URL does not resolve.
wget https://github.com/aws/aws-msk-iam-auth/releases/latest/download/aws-msk-iam-auth.jar
export IAM_JAR=$(pwd)/aws-msk-iam-auth.jar

โœ๏ธ Create client.properties

security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler

ssl.truststore.location=/home/ec2-user/msk-certs/truststore.jks
ssl.truststore.password=anjali

โ˜๏ธ Make sure the truststore includes AWS MSKโ€™s CA certificate.


Step 4: Create Kafka Topic on Primary Cluster

CLASSPATH=$IAM_JAR:$KAFKA_HOME/libs/* $KAFKA_HOME/bin/kafka-topics.sh \
  --create \
  --topic test-topic \
  --partitions 3 \
  --replication-factor 3 \
  --bootstrap-server <primary-broker-list> \
  --command-config /home/ec2-user/msk-certs/client.properties

Replace <primary-broker-list> with your actual MSK bootstrap brokers.


โš™๏ธ Step 5: Configure and Run MirrorMaker 2

โœ๏ธ Create mm2.properties

clusters = primary,dr

primary.bootstrap.servers=<primary-brokers>
primary.security.protocol=SASL_SSL
primary.sasl.mechanism=AWS_MSK_IAM
primary.sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
primary.ssl.truststore.location=/home/ec2-user/msk-certs/kafka.client.truststore.jks
primary.ssl.truststore.password=anjali

dr.bootstrap.servers=<dr-brokers>
dr.security.protocol=SASL_SSL
dr.sasl.mechanism=AWS_MSK_IAM
dr.sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
dr.ssl.truststore.location=/home/ec2-user/msk-certs/kafka.client.truststore.jks
dr.ssl.truststore.password=anjali

tasks.max=2
replication.policy.class=org.apache.kafka.connect.mirror.DefaultReplicationPolicy

# Enable the primary -> dr flow; without an enabled flow, MM2 replicates nothing.
primary->dr.enabled=true
primary->dr.topics=test-topic
primary->dr.groups=.*

โ–ถ๏ธ Run MM2

CLASSPATH=$IAM_JAR:$KAFKA_HOME/libs/*:$KAFKA_HOME/libs/connect-runtime-*.jar:$KAFKA_HOME/libs/connect-api-*.jar \
  $KAFKA_HOME/bin/connect-mirror-maker.sh /home/ec2-user/mm2.properties
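For anything beyond a quick test, MM2 should survive reboots and crashes. A minimal systemd unit sketch, assuming the paths used above (save as /etc/systemd/system/mirrormaker2.service, adjust User and paths to your instance, then run sudo systemctl enable --now mirrormaker2):

```ini
[Unit]
Description=Kafka MirrorMaker 2 (MSK cross-region replication)
After=network-online.target

[Service]
User=ec2-user
# Kafka's run scripts append $CLASSPATH, which is how the IAM auth JAR is picked up.
Environment=CLASSPATH=/home/ec2-user/aws-msk-iam-auth.jar
ExecStart=/home/ec2-user/kafka_2.13-3.3.1/bin/connect-mirror-maker.sh /home/ec2-user/mm2.properties
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
```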

Step 6: Validate Replication

List Topics on DR

With the DefaultReplicationPolicy, replicated topics carry the source-cluster alias as a prefix, so expect to see primary.test-topic in the output.

CLASSPATH=$IAM_JAR:$KAFKA_HOME/libs/* $KAFKA_HOME/bin/kafka-topics.sh \
  --list \
  --bootstrap-server <dr-broker> \
  --command-config /home/ec2-user/msk-certs/client.properties

Consume Messages from DR

CLASSPATH=$IAM_JAR:$KAFKA_HOME/libs/* $KAFKA_HOME/bin/kafka-console-consumer.sh \
  --topic primary.test-topic \
  --from-beginning \
  --bootstrap-server <dr-broker> \
  --consumer.config /home/ec2-user/msk-certs/client.properties

โœ๏ธ Step 7: Test Message Flow

Produce to Primary

CLASSPATH=$IAM_JAR:$KAFKA_HOME/libs/* $KAFKA_HOME/bin/kafka-console-producer.sh \
  --topic test-topic \
  --bootstrap-server <primary-broker> \
  --producer.config /home/ec2-user/msk-certs/client.properties

Type some messages and hit Enter.

Confirm on DR

Re-run the consumer from the DR cluster to verify real-time replication.
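MM2's MirrorCheckpointConnector also translates consumer offsets across clusters. You can sanity-check this on the DR side with the same client config; note that source groups only appear in this listing if you set primary->dr.sync.group.offsets.enabled=true in mm2.properties (available since Kafka 2.7):

```shell
# List consumer groups known to the DR cluster. With group-offset syncing
# enabled, groups replicated from the primary appear here alongside
# MM2's own internal groups.
CLASSPATH=$IAM_JAR:$KAFKA_HOME/libs/* $KAFKA_HOME/bin/kafka-consumer-groups.sh \
  --list \
  --bootstrap-server <dr-broker> \
  --command-config /home/ec2-user/msk-certs/client.properties
```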

Have you implemented DR for Kafka in your architecture? Drop your approach or challenges in the comments; I'd love to hear how others tackle cross-region resilience!

Follow me for more AWS infrastructure and streaming data posts!
