Automate ECS with Terraform: Scalable Solutions

1. Introduction

When deploying microservices on Amazon ECS Fargate, manually setting up repositories, task definitions, services, load balancers, and Service Connect becomes tedious and error-prone. Add dynamic environment variables, secrets, sidecar containers (CloudWatch Agent), health checks, service discovery, and Service Connect logging, and the complexity only grows.

This is where Terraform automation shines. In this post, I’ll show you how I built a modular, production-grade Terraform solution that makes ECS service creation:

Repeatable: define your services once in JSON, and Terraform provisions everything.
Dynamic: environment variables, secrets, ports, mount points, and volumes come from the JSON configuration at plan time instead of being hard-coded.
Flexible: supports Service Connect, ALB integration, health checks, CloudWatch Agent sidecar, and ECS-managed tags.
Scalable: spin up one service or 20 in a single terraform apply.

2. Architecture & Goals

Our Terraform project automates the following for each ECS service:

ECR repository (for container images).
ECS Task Definition (main container + optional CloudWatch Agent sidecar + volumes).
ECS Fargate Service (with Service Connect, ALB listener rules, and target groups).

We wanted it to:

Use modular Terraform for reusability.
Drive configuration via a JSON file (services.json) for dynamic service onboarding.
Support per-service overrides (CPU/memory, secrets, logging, Service Connect, etc.).
Provide production features like health check grace period, ECS managed tags, CloudWatch logging, and volumes.

3. Terraform Project Structure

Here’s the recommended repo layout:

terraform-ecs-modular/
├─ modules/
│  ├─ ecr/
│  ├─ task_definition/
│  └─ service/
├─ examples/
│  └─ services.json
├─ main.tf
├─ variables.tf
├─ outputs.tf
├─ providers.tf
└─ README.md

4. Key Modules

ECR Module

The ecr module creates an ECR repository per service.

resource "aws_ecr_repository" "this" {
  name                 = var.name
  image_tag_mutability = "MUTABLE"
  encryption_configuration { encryption_type = "AES256" }
}

This ensures each microservice has its own repository for container images.
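
A module like this usually also declares its input and exposes the repository URL so other modules can build an image reference from it. The following is a minimal sketch of that interface. The file split and the output name repository_url are assumptions, not taken from the published repository.

```hcl
# modules/ecr/variables.tf (sketch)
variable "name" {
  description = "Name of the ECR repository, typically the service name."
  type        = string
}

# modules/ecr/outputs.tf (sketch; the output name is an assumption)
output "repository_url" {
  description = "URL of the repository, usable as the base of the container image reference."
  value       = aws_ecr_repository.this.repository_url
}
```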

Task Definition Module

This is the heart of our automation. It supports:

Dynamic env vars and secrets (from JSON).
Default port mappings with appProtocol = "http".
Memory hard and soft limits per container.
Optional CloudWatch Agent sidecar container with shared volume mounts.

Example snippet with CloudWatch Agent support:

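locals {
  # Main application container. Keys use the camelCase names that the ECS
  # container definition JSON expects.
  main_container_def = {
    name              = var.container_name
    image             = var.image
    cpu               = var.cpu
    memory            = var.container_memory_hard # hard limit (MiB)
    memoryReservation = var.container_memory_soft # soft limit (MiB)
    essential         = true
    portMappings      = [...]
    environment       = [...]
    # Assumes mount points arrive with snake_case keys, as in services.json,
    # and converts them to the camelCase keys ECS requires.
    mountPoints = [
      for mp in var.main_container_mount_points : {
        sourceVolume  = mp.source_volume
        containerPath = mp.container_path
        readOnly      = mp.read_only
      }
    ]
    logConfiguration = {
      logDriver = var.log_driver
      options   = var.log_options
    }
  }

  # Optional CloudWatch Agent sidecar: a one-element list when enabled, empty otherwise.
  cloudwatch_container_def = var.enable_cloudwatch_agent ? [
    {
      name  = var.cloudwatch_agent_config.name
      image = var.cloudwatch_agent_config.image
      mountPoints = [
        for mp in var.cloudwatch_agent_config.mount_points : {
          sourceVolume  = mp.source_volume
          containerPath = mp.container_path
          readOnly      = mp.read_only
        }
      ]
      logConfiguration = {
        logDriver = var.cloudwatch_agent_config.log_configuration.log_driver
        options   = var.cloudwatch_agent_config.log_configuration.options
      }
    }
  ] : []

  container_definitions_list = concat([local.main_container_def], local.cloudwatch_container_def)
}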

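resource "aws_ecs_task_definition" "this" {
  family                   = var.family
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc" # required for Fargate tasks
  cpu                      = var.task_cpu
  memory                   = var.task_memory
  container_definitions    = jsonencode(local.container_definitions_list)

  # Not shown: execution_role_arn, which Fargate needs in order to pull from ECR,
  # write awslogs output, and resolve secrets.

  dynamic "volume" {
    for_each = var.volumes
    content {
      name = volume.value.name
      # host_path is not supported on Fargate; it stays null unless set explicitly.
      host_path = try(volume.value.host_path, null)
    }
  }
}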
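
The snippet above leaves the portMappings and environment lists elided. One possible way to build them, assuming the module receives environment and secrets as maps and port mappings as a list of objects (the shapes used in services.json), is with for expressions. The variable and local names below are hypothetical and may differ from the real module.

```hcl
# Sketch only: variable and local names are assumptions, not the author's module inputs.
variable "environment" {
  type    = map(string) # env var name => value
  default = {}
}

variable "secrets" {
  type    = map(string) # env var name => Secrets Manager or SSM parameter ARN
  default = {}
}

variable "port_mappings" {
  type    = list(object({ container_port = number }))
  default = []
}

locals {
  # ECS expects a list of { name, value } objects rather than a plain map.
  environment_list = [for k, v in var.environment : { name = k, value = v }]

  # Secrets are passed by reference through "valueFrom".
  secrets_list = [for k, v in var.secrets : { name = k, valueFrom = v }]

  # Each port mapping gets a name so that Service Connect can refer to it.
  port_mappings_list = [
    for pm in var.port_mappings : {
      name          = "http-${pm.container_port}"
      containerPort = pm.container_port
      protocol      = "tcp"
      appProtocol   = "http"
    }
  ]
}
```

These locals could then be assigned to the environment, secrets, and portMappings keys of the main container definition.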

Service Module

This module provisions the ECS service itself:

Uses existing ALB and creates a new target group + listener rule.
Enables ECS managed tags.
Configures health check grace period.
Supports Service Connect (namespace by name or ARN, client-server mode, logs).

resource "aws_ecs_service" "this" {
  name            = var.service_name
  cluster         = var.cluster_arn
  task_definition = var.task_definition_arn
  desired_count   = var.desired_count
  launch_type     = "FARGATE"

  enable_ecs_managed_tags = true
  propagate_tags          = "SERVICE"
  # Only valid when the service has a load_balancer block (omitted from this snippet).
  health_check_grace_period_seconds = 60

  network_configuration {
    subnets          = var.subnet_ids
    security_groups  = var.security_group_ids
    assign_public_ip = var.assign_public_ip
  }

  dynamic "service_connect_configuration" {
    for_each = var.enable_service_connect ? [1] : []
    content {
      enabled   = true
      namespace = var.service_connect_namespace

      service {
        # Must match the name of a port mapping in the task definition.
        port_name      = var.service_connect_port_name
        discovery_name = var.service_connect_discovery_name
        client_alias {
          port     = var.container_port
          dns_name = var.service_connect_client_dns_name
        }
      }

      log_configuration {
        log_driver = "awslogs"
        options = {
          awslogs-group  = var.service_connect_log_group
          awslogs-region = var.service_connect_log_region
        }
      }
    }
  }
}
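
The ALB pieces listed at the top of this section do not appear in the snippet. A minimal sketch of what the target group, listener rule, and load_balancer block might look like follows. The variables vpc_id, listener_arn, target_group_port, health_check_path, and container_name, as well as the path pattern, are assumptions made for illustration.

```hcl
# Sketch only: resource layout and variable names are assumptions.
resource "aws_lb_target_group" "this" {
  name        = "${var.service_name}-tg" # target group names are limited to 32 characters
  port        = var.target_group_port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip" # required for Fargate tasks, which use awsvpc networking

  health_check {
    path    = var.health_check_path
    matcher = "200"
  }
}

resource "aws_lb_listener_rule" "this" {
  listener_arn = var.listener_arn
  # priority is left unset, so the provider assigns the next available one.

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.this.arn
  }

  condition {
    path_pattern {
      values = ["/${var.service_name}/*"] # hypothetical routing convention
    }
  }
}

# Inside aws_ecs_service.this, the service would then register its tasks:
#
#   load_balancer {
#     target_group_arn = aws_lb_target_group.this.arn
#     container_name   = var.container_name
#     container_port   = var.container_port
#   }
```

Leaving priority unset is one simple way to get auto-generated priorities. It does not protect against conflicts with rules created outside Terraform.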

5. JSON-Driven Services

The beauty of this approach is the services.json file. Instead of duplicating Terraform code, we declare each service in JSON and Terraform loops through it.

Example:

{
  "my-example-service1": {
    "service_name": "my-example-service1",
    "ecr_name": "my-example-service1",
    "task_family": "my-example-service1-td",

    "container": {
      "name": "my-example-service1",
      "image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/my-example-service1:latest",
      "cpu": 512,
      "memory": 1024,
      "port_mappings": [{ "container_port": 8080 }]
    },

    "environment": { "SPRING_PROFILES_ACTIVE": "dev" },
    "secrets": { "DB_PASSWORD": "arn:aws:secretsmanager:us-east-1:123:secret:db-pass" },

    "main_container_mount_points": [
      { "source_volume": "cms-logs", "container_path": "/app/logs", "read_only": false }
    ],
    "volumes": [{ "name": "cms-logs" }],

    "enable_cloudwatch_agent": true,
    "cloudwatch_agent_config": {
      "name": "cms-cloudwatch-agent",
      "image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/cloudwatch-agent:latest",
      "cpu": 0,
      "environment": [{ "name": "service_name", "value": "my-example-service1" }],
      "mount_points": [
        { "source_volume": "cms-logs", "container_path": "/logs/my-example-service1", "read_only": false }
      ],
      "log_configuration": {
        "log_driver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/my-example-service1-cloudwatch-agent",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    },

    "target_group_port": 8080,
    "health_check_path": "/health",
    "enable_service_connect": true
  }
}
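
To make the looping concrete, here is a rough sketch of how the root main.tf might decode the file and fan out over the modules. Only some inputs are shown. The environment and secrets module inputs are assumptions; the remaining names follow the variables used in the module snippets earlier in this post.

```hcl
# main.tf (sketch): wiring and some input names are assumptions.
locals {
  # Decode the JSON file into a map keyed by service name.
  services = jsondecode(file("${path.module}/examples/services.json"))
}

module "ecr" {
  source   = "./modules/ecr"
  for_each = local.services

  name = each.value.ecr_name
}

module "task_definition" {
  source   = "./modules/task_definition"
  for_each = local.services

  family         = each.value.task_family
  container_name = each.value.container.name
  image          = each.value.container.image

  # try() falls back to a default when a key is missing from the JSON,
  # which is what makes per-service settings optional.
  environment             = try(each.value.environment, {})
  secrets                 = try(each.value.secrets, {})
  enable_cloudwatch_agent = try(each.value.enable_cloudwatch_agent, false)
}
```

Because the file is read at plan time, the for_each keys are known up front, and adding a service is a matter of adding one more top-level key to the JSON.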

6. Advanced Features We Covered

✅ Dynamic env vars + secrets via JSON
✅ Service Connect with logging and client-server mode
✅ ALB integration with auto-generated priorities
✅ ECS managed tags and health check grace period
✅ CloudWatch Agent sidecar container with shared log volume
✅ Dynamic volumes and mount points
✅ Memory hard + soft limits at container level

7. Example Workflow

Clone the repo
Update examples/services.json with your services
Set AWS vars in terraform.tfvars:

aws_region        = "us-east-1"
cluster_arn       = "arn:aws:ecs:us-east-1:123456789:cluster/my-cluster"
vpc_id            = "vpc-abc123"
subnet_ids        = ["subnet-123", "subnet-456"]
security_group_ids = ["sg-12345"]
listener_arn      = "arn:aws:elasticloadbalancing:us-east-1:123:listener/app/my-alb/xxx/yyy"

Run Terraform:

terraform init
terraform plan
terraform apply
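
The values in terraform.tfvars need matching declarations in variables.tf. A minimal sketch, with types inferred from the example values above and descriptions added as assumptions:

```hcl
# variables.tf (sketch): names follow terraform.tfvars; types are inferred.
variable "aws_region" {
  description = "AWS region to deploy into."
  type        = string
}

variable "cluster_arn" {
  description = "ARN of the existing ECS cluster."
  type        = string
}

variable "vpc_id" {
  description = "VPC that hosts the services and target groups."
  type        = string
}

variable "subnet_ids" {
  description = "Subnets in which the Fargate tasks run."
  type        = list(string)
}

variable "security_group_ids" {
  description = "Security groups attached to the tasks."
  type        = list(string)
}

variable "listener_arn" {
  description = "ARN of the existing ALB listener that receives the new rules."
  type        = string
}
```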

8. Full Code Repository

Want to try this setup in your own AWS environment?
I’ve published the complete Terraform project with modules, JSON examples, and usage instructions in my GitHub repo:

👉 View the Full Code on GitHub

Feel free to ⭐️ the repo if you find it useful!

9. Closing Thoughts

This modular setup allows you to scale ECS adoption across dozens of microservices without copy-pasting Terraform code.

You can add as many sidecars as you need dynamically.
Infra teams can manage shared modules.
App teams just drop service configs into JSON.
Features like CloudWatch sidecars, Service Connect, and ALB integration are opt-in per service.

Future improvements could include automated ALB listener priority conflict resolution.

Thank you for taking the time to read my post! 🙌 If you found it insightful, I’d truly appreciate a like and share to help others benefit as well.
