How to Set Up Kafka and ZooKeeper with High Availability on AWS ECS

Naren Malireddy

Let's now dive into a practical deployment guide for running Kafka and ZooKeeper in a containerized, HA-ready setup on AWS ECS (EC2 or Fargate).

📦 Architecture Overview

We will deploy:

  • Apache ZooKeeper cluster (3 nodes for HA quorum)

  • Apache Kafka brokers (3 brokers for replication and load balancing)

  • Each broker runs in its own ECS task

  • Internal networking via ECS Service Discovery

  • Data persistence with Amazon EBS (EC2) or EFS (Fargate)

                +----------------------+
                |    Kafka Clients     |
                +----------------------+
                          |
              ┌───────────▼────────────┐
              │    AWS ECS Cluster     │
              │   (Kafka + ZooKeeper)  │
              └────────────────────────┘
     +-------------------+    +-------------------+
     |  Kafka Broker 1   |    |  Kafka Broker 2   |
     | zookeeper:2181    |    | zookeeper:2181    |
     +-------------------+    +-------------------+
              |                      |
     +-------------------+    +-------------------+
     | ZooKeeper Node 1  |    | ZooKeeper Node 2  |
     +-------------------+    +-------------------+
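
The choice of three ZooKeeper nodes follows from majority-quorum arithmetic: the ensemble stays available only while more than half of its nodes can reach each other. A minimal sketch of that arithmetic (the function names are illustrative, not part of any ZooKeeper API):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Nodes that can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for nodes in (1, 2, 3, 5):
        print(nodes, quorum_size(nodes), tolerated_failures(nodes))
```

Under this rule a 2-node ensemble tolerates no failures at all, which is why 3 is the smallest ensemble size that survives the loss of one node or one AZ.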

🧱 Step-by-Step: Kafka & ZooKeeper on ECS


1. Containerize Kafka and ZooKeeper

Use Bitnami or Confluent Docker images (or build your own):

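# docker-compose.yml (for local testing: one ZooKeeper node, one broker)
version: '3'
services:
  zookeeper:
    image: bitnami/zookeeper:3.9
    ports:
      - "2181:2181"
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes        # local testing only
      - ZOO_SERVER_ID=1                  # Bitnami's variable for the node ID
      - ZOO_SERVERS=zookeeper:2888:3888  # Bitnami format: host:peer-port:election-port

  kafka:
    image: bitnami/kafka:3.6
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
      # Address handed back to clients; localhost suits a client on the Docker host
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092
      - ALLOW_PLAINTEXT_LISTENER=yes     # local testing only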

Push these images to Amazon ECR.
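
Pushing to ECR usually comes down to three commands: log Docker in to the registry, tag the local image with the registry URI, and push it. The sketch below only assembles those command strings so they can be reviewed before running them; the account ID, region, and repository name are placeholders, and it assumes the ECR repository already exists.

```python
def ecr_push_commands(account_id: str, region: str, repo: str,
                      tag: str, local_image: str) -> list[str]:
    """Build the login, tag, and push commands for one image."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    remote = f"{registry}/{repo}:{tag}"
    return [
        # Exchange AWS credentials for a temporary Docker login
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        f"docker tag {local_image} {remote}",
        f"docker push {remote}",
    ]


if __name__ == "__main__":
    # Placeholder values, not real account details
    for cmd in ecr_push_commands("123456789012", "us-east-1",
                                 "kafka", "3.6", "bitnami/kafka:3.6"):
        print(cmd)
```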


2. 🛠️ Create ECS Cluster (Fargate or EC2)

  • Use ECS Console or IaC (Terraform/CloudFormation)

  • For HA, prefer EC2 + Auto Scaling Group

  • Ensure subnets span multiple AZs

  • Attach ECS instances to a shared security group


3. Set Up IAM, Security Groups & Networking

  • ECS task execution role with permissions for:

    • ECR pull

    • CloudWatch Logs

  • Security Groups:

    • Kafka → allow port 9092

    • ZooKeeper → allow ports 2181 (clients), 2888 (peer traffic), 3888 (leader election)

  • Enable ECS Service Discovery (backed by AWS Cloud Map, which manages the Route 53 DNS records)
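
The port rules above can be written down as data before they go into the console or an IaC template. The sketch below builds entries shaped like the IpPermissions list accepted by the EC2 AuthorizeSecurityGroupIngress API, allowing traffic only from members of one security group; the group ID is a placeholder and the helper name is hypothetical.

```python
KAFKA_PORTS = [9092]
ZOOKEEPER_PORTS = [2181, 2888, 3888]


def ingress_rules(ports: list[int], source_group_id: str) -> list[dict]:
    """One TCP rule per port, open only to the given security group."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "UserIdGroupPairs": [{"GroupId": source_group_id}],
        }
        for port in ports
    ]


if __name__ == "__main__":
    shared_group = "sg-0123456789abcdef0"  # placeholder ID
    for rule in ingress_rules(KAFKA_PORTS + ZOOKEEPER_PORTS, shared_group):
        print(rule)
```

Referencing the shared security group as the source, rather than a CIDR range, keeps the ports closed to anything outside the cluster.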


4. 🚢 Deploy ECS Services

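Deploy each Kafka and ZooKeeper instance as a separate ECS service, so that every node keeps a stable identity and DNS name:

  • 🟠 ZooKeeper:

    • Three services with a desired count of 1 each (3 nodes for quorum)

    • Static DNS names via ECS Cloud Map (zookeeper1.service.local, etc.)

  • 🔵 Kafka:

    • Three services with a desired count of 1 each (3 brokers)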

    • Each configured with:

      • KAFKA_CFG_BROKER_ID

      • KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper1.service.local:2181,...
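
Because every broker needs a unique ID and the same ZooKeeper connection string, it can help to generate the per-broker environment rather than typing it three times. A minimal sketch, assuming the zookeeperN.service.local naming used above (the helper names are hypothetical):

```python
NAMESPACE = "service.local"  # Cloud Map namespace used in this guide


def zookeeper_connect(node_count: int, port: int = 2181,
                      namespace: str = NAMESPACE) -> str:
    """Comma-separated host:port list covering every ZooKeeper node."""
    hosts = [f"zookeeper{i}.{namespace}" for i in range(1, node_count + 1)]
    return ",".join(f"{host}:{port}" for host in hosts)


def broker_environment(broker_id: int, zk_nodes: int = 3) -> dict[str, str]:
    """Environment variables for one broker's container definition."""
    return {
        "KAFKA_CFG_BROKER_ID": str(broker_id),
        "KAFKA_CFG_ZOOKEEPER_CONNECT": zookeeper_connect(zk_nodes),
    }


if __name__ == "__main__":
    for broker_id in (1, 2, 3):
        print(broker_environment(broker_id))
```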

Sample ECS Task Definition Snippet (Kafka)

{
  "containerDefinitions": [
    {
      "name": "kafka-broker",
      "image": "your_account_id.dkr.ecr.region.amazonaws.com/kafka:latest",
      "essential": true,
      "portMappings": [
        { "containerPort": 9092, "hostPort": 9092 }
      ],
      "environment": [
        { "name": "KAFKA_CFG_BROKER_ID", "value": "1" },
        { "name": "KAFKA_CFG_ZOOKEEPER_CONNECT", "value": "zookeeper1.service.local:2181,zookeeper2.service.local:2181,zookeeper3.service.local:2181" },
        { "name": "ALLOW_PLAINTEXT_LISTENER", "value": "yes" }
      ]
    }
  ],
  "requiresCompatibilities": ["EC2"],
  "memory": "2048",
  "cpu": "1024"
}

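💡 Tip: Use placementConstraints (for example, a memberOf expression on attribute:ecs.availability-zone) to pin each service's task to a different Availability Zone. For a service that runs several tasks, a spread placementStrategy on the same attribute distributes them across AZs.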
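
With one ECS service per broker, AZ placement can be expressed as a memberOf constraint on each service. The sketch below assigns brokers to zones round-robin and emits the constraint for each; the AZ names are assumptions for illustration, and placement constraints apply to the EC2 launch type only.

```python
def az_constraint(availability_zone: str) -> dict:
    """memberOf constraint that restricts tasks to instances in one AZ."""
    return {
        "type": "memberOf",
        "expression": f"attribute:ecs.availability-zone == {availability_zone}",
    }


def constraints_by_broker(zones: list[str], broker_count: int) -> dict[int, list[dict]]:
    """Assign brokers 1..broker_count to zones in round-robin order."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    return {
        broker_id: [az_constraint(zones[(broker_id - 1) % len(zones)])]
        for broker_id in range(1, broker_count + 1)
    }


if __name__ == "__main__":
    zones = ["us-east-1a", "us-east-1b", "us-east-1c"]  # assumed AZ names
    for broker_id, constraints in constraints_by_broker(zones, 3).items():
        print(broker_id, constraints)
```

Pinning also fits EBS-backed storage, since an EBS volume can only be attached to instances in its own AZ.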


5. 💾 Enable Data Persistence

  • ZooKeeper & Kafka require persistent storage

  • Use EBS volumes mounted to ECS tasks (if EC2-backed)

  • Or use Amazon EFS with Fargate
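
For the EFS option, persistence is declared in two places in the task definition: a volume at the task level and a mount point in the container definition. A minimal sketch of those two fragments, assuming a placeholder file system ID and the Bitnami image's data directory (/bitnami/kafka); adjust the path if you build your own image:

```python
import json


def efs_volume(name: str, file_system_id: str) -> dict:
    """Task-level volume entry backed by an EFS file system."""
    return {
        "name": name,
        "efsVolumeConfiguration": {
            "fileSystemId": file_system_id,
            "transitEncryption": "ENABLED",
        },
    }


def mount_point(volume_name: str, container_path: str) -> dict:
    """Container-level entry that mounts the named volume read-write."""
    return {
        "sourceVolume": volume_name,
        "containerPath": container_path,
        "readOnly": False,
    }


if __name__ == "__main__":
    fragment = {
        "volumes": [efs_volume("kafka-data", "fs-0123456789abcdef0")],  # placeholder ID
        "mountPoints": [mount_point("kafka-data", "/bitnami/kafka")],
    }
    print(json.dumps(fragment, indent=2))
```

The volume name is what links the two entries, so it has to match exactly.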


6. 🧪 Validate the Cluster

  • Produce and consume a test message with the Kafka CLI (or any client library):

      kafka-console-producer.sh --bootstrap-server kafka1:9092 --topic test
      kafka-console-consumer.sh --bootstrap-server kafka1:9092 --topic test --from-beginning
    
  • Use CloudWatch Logs, kafka-topics.sh, and ZooKeeper CLI tools for health checks
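
One useful health check is to look for partitions whose in-sync replica set (ISR) is smaller than their replica list. The sketch below parses text in the shape printed by kafka-topics.sh --describe; the sample is hand-written to match that shape and the exact fields can vary between Kafka versions, so treat the pattern as an assumption to verify against your own output.

```python
import re

# Matches per-partition lines such as:
#   Topic: test  Partition: 1  Leader: 2  Replicas: 2,3,1  Isr: 2,3
PARTITION_RE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*\S+"
    r"\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def under_replicated(describe_output: str) -> list[tuple[str, int]]:
    """Return (topic, partition) pairs whose ISR is missing replicas."""
    problems = []
    for line in describe_output.splitlines():
        match = PARTITION_RE.search(line)
        if match is None:
            continue  # topic summary line or unrelated output
        topic, partition, replicas, isr = match.groups()
        replica_ids = set(replicas.split(","))
        isr_ids = {broker for broker in isr.split(",") if broker}
        if len(isr_ids) < len(replica_ids):
            problems.append((topic, int(partition)))
    return problems


SAMPLE = (
    "Topic: test\tTopicId: abc123\tPartitionCount: 2\tReplicationFactor: 3\tConfigs: \n"
    "\tTopic: test\tPartition: 0\tLeader: 1\tReplicas: 1,2,3\tIsr: 1,2,3\n"
    "\tTopic: test\tPartition: 1\tLeader: 2\tReplicas: 2,3,1\tIsr: 2,3\n"
)

if __name__ == "__main__":
    print(under_replicated(SAMPLE))
```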


🧯 High Availability Best Practices

Component        HA Strategy
ZooKeeper        Minimum 3 nodes for quorum; spread across AZs
Kafka Brokers    3+ brokers with replication (min.insync.replicas=2)
Storage          EBS or EFS with redundancy
ECS              Auto Scaling across AZs
DNS              Use Cloud Map or Route 53 for service discovery
Monitoring       CloudWatch, Kafka Exporter + Prometheus
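
The min.insync.replicas=2 setting is easiest to reason about as arithmetic. Assuming producers use acks=all and topics have a replication factor of 3 (neither is stated in the table), a write succeeds only while at least min.insync.replicas replicas are in sync. A minimal sketch with a hypothetical helper:

```python
def write_tolerated_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Replicas that can be lost before acks=all writes start failing."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min_insync_replicas must be between 1 and replication_factor")
    return replication_factor - min_insync_replicas


if __name__ == "__main__":
    print(write_tolerated_failures(3, 2))  # the setup in this guide
    print(write_tolerated_failures(3, 3))  # stricter setting, for comparison
```

With 3 replicas and min.insync.replicas=2, one broker can be down while writes continue; raising the setting to 3 would make any single broker failure block writes.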

📌 Conclusion

Apache Kafka and ZooKeeper bring critical capabilities to microservices: event streaming, durability, and fault-tolerant coordination. Deploying them on AWS ECS with high availability provides:

  • Scalability across services

  • Resilience to AZ or task failures

  • Decoupled, event-driven communication

With ECS, you offload much of the orchestration while gaining flexibility to manage Kafka clusters as containerized services, without locking into managed services too early.

