Debugging ECS Task Failures: How Terraform Remote State Saved My Deployment

Deploying to AWS ECS with Terraform sounded simple—until it wasn’t.
A few weeks ago, I was building a Dockerized app, pushing it to ECR, and deploying it via ECS Fargate using Terraform. Everything was fine... until suddenly, my ECS task started showing "STOPPED" status every time I deployed.
This blog shares exactly what went wrong, what I tried, what worked, and most importantly—what I learned.
The Setup
I'm learning DevOps hands-on by building real infrastructure using:
Terraform for infrastructure as code
Docker for containerization
AWS ECS (Fargate) to run containers
ECR to store Docker images
NGINX as the test app inside the container
I had two separate Terraform folders:
terraform/network: created the VPC, public subnets, route table, and security group
terraform/ecs: defined the ECS cluster, task definition, and service
Everything worked well initially.
The Problem: ECS Tasks Always Stopped
After making some changes in my terraform/network stack (replacing the security group to open multiple ports), I applied the changes successfully.
But when I ran terraform apply in the ecs/ folder to re-deploy my app, ECS launched the task... and it immediately went to STOPPED state.
I kept refreshing the ECS console, checking logs, trying different container ports, even rebuilding and pushing my Docker image several times. Nothing helped.
The Root Cause
After digging deeper, I discovered that:
My ECS service was still using the old security group ID, which no longer existed.
In my ECS Terraform file, the network_configuration block was hardcoded:
subnets = ["subnet-abc123", "subnet-def456"]
security_groups = ["sg-0123456789"]
Since the old SG was gone (replaced during the network changes), ECS couldn't attach it to the task, and the task failed before starting.
This was a classic example of a broken dependency between infrastructure components.
The Fix: Terraform Remote State
Instead of hardcoding subnet and security group IDs, I needed a way for the ecs/ Terraform stack to dynamically read the correct, up-to-date values from the network/ stack.
The cleanest solution: terraform_remote_state
Step 1: Output subnet and SG from network
In terraform/network/outputs.tf:
output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}

output "security_group_id" {
  value = aws_security_group.ecs_sg.id
}
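The outputs above assume the network stack declares its public subnets with count (so the [*] splat returns a list of IDs) and a security group named ecs_sg. The following is a minimal sketch of what such resources might look like, not the actual code from the repo; the CIDR ranges, the port list, and the aws_vpc.main resource are assumptions for illustration:

```hcl
# Hypothetical network resources that the outputs refer to.
# CIDR blocks, ports, and aws_vpc.main are illustrative assumptions.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  map_public_ip_on_launch = true
}

resource "aws_security_group" "ecs_sg" {
  name_prefix = "ecs-sg-"
  vpc_id      = aws_vpc.main.id

  # One ingress rule per port in the list.
  dynamic "ingress" {
    for_each = [80, 443]
    content {
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }

  # Allow all outbound traffic (needed to pull the image from ECR).
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Changing a security group's name or description forces Terraform to replace it, and the replacement gets a new ID. That is the kind of change that silently invalidates an ID copied into another stack.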
Step 2: Read remote state in ECS
In terraform/ecs/main.tf:
data "terraform_remote_state" "network" {
  backend = "local"

  config = {
    # Relative path, resolved from the directory where terraform runs (here, terraform/ecs)
    path = "../network/terraform.tfstate"
  }
}
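The local backend works when both stacks live on the same machine. If the network stack's state were stored remotely, the same data source could point at that backend in place of the local version. A sketch assuming an S3 backend; the bucket name, key, and region are placeholders, not values from this project:

```hcl
# Alternative to the local version above, assuming the network stack
# stores its state in S3. All three values are placeholders.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"   # placeholder bucket name
    key    = "network/terraform.tfstate" # placeholder state object key
    region = "us-east-1"                 # placeholder region
  }
}
```

The rest of the ECS stack would not need to change, since it only refers to data.terraform_remote_state.network.outputs.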
Step 3: Use those values in ECS service
network_configuration {
  subnets          = data.terraform_remote_state.network.outputs.public_subnet_ids
  assign_public_ip = true
  security_groups  = [data.terraform_remote_state.network.outputs.security_group_id]
}
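For context, the network_configuration block sits inside the aws_ecs_service resource. A sketch of how the whole resource might look; the service name and the aws_ecs_cluster.main and aws_ecs_task_definition.app references are assumed names for resources defined elsewhere in the ecs stack:

```hcl
resource "aws_ecs_service" "app" {
  name            = "nginx-service"                 # hypothetical service name
  cluster         = aws_ecs_cluster.main.id         # assumes a cluster resource named "main"
  task_definition = aws_ecs_task_definition.app.arn # assumes a task definition named "app"
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    # Both values come from the network stack's outputs, not from copied IDs.
    subnets          = data.terraform_remote_state.network.outputs.public_subnet_ids
    security_groups  = [data.terraform_remote_state.network.outputs.security_group_id]
    assign_public_ip = true
  }
}
```

Remote state is read when the ecs stack is planned, so after the network stack changes, the ecs stack still has to be applied again to pick up the new IDs.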
Once I made this change, everything just worked.
I applied the ECS Terraform again, refreshed the console, and this time the ECS task moved to RUNNING state. Opening the public IP showed:
🚀 Hello from Deepak's Docker container (served via NGINX)!
🎉 Finally!
What I Learned (So You Don't Repeat It)
✅ Don't hardcode resource IDs
- Use variables, outputs, and remote state instead
- Hardcoded values become stale quickly, especially with Terraform's resource replacements
✅ Use outputs.tf as a contract
- Think of it like a public API between your Terraform stacks
✅ Use terraform_remote_state to wire your infra
- It allows clean separation of concerns: one stack defines, the other consumes
✅ A stopped ECS task means something failed at launch
- Check IAM roles, SGs, subnets, logs
- If nothing shows in logs, it's often a networking or IAM issue
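If an ID does have to be passed in by hand, one way to catch a stale value early is to look it up with a data source. The lookup makes terraform plan fail when the security group no longer exists, rather than leaving the problem to surface at task launch. A minimal sketch; the variable and output names are assumptions:

```hcl
variable "security_group_id" {
  type        = string
  description = "ID of the security group to attach to the ECS tasks"
}

# Looking the group up by ID makes the plan error out if it has been deleted.
data "aws_security_group" "ecs" {
  id = var.security_group_id
}

# Use the looked-up ID (not the raw variable) wherever the group is needed.
output "validated_security_group_id" {
  value = data.aws_security_group.ecs.id
}
```

This does not replace remote state. It only turns a stale hardcoded ID into an early, readable error.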
What’s Next
This was my first real-world style ECS deployment. I learned more from this single issue than from hours of tutorials.
In upcoming blogs, I’ll cover:
Adding a Load Balancer (ALB) in front of ECS
Setting up CI/CD with GitHub Actions
Deploying a real Drupal app instead of a placeholder
If you're just starting your DevOps journey: don't worry if things break. That’s when you learn.
Thanks for reading 🙏
✅ You can view the full codebase here: GitHub - deepakaryan1988/Drupal-AWS
💬 DM me or comment if you’ve faced similar ECS issues — happy to share more!
#AWS #Terraform #DevOps #ECS #RemoteState #Debugging #Hashnode #LinkedIn
Written by Deepak Kumar
