Fixing ECS Deployment Timeouts with Terraform & GitHub Actions

🚀 Introduction – The Problem

Our FeedbackHub project runs on AWS ECS Fargate with infrastructure managed by Terraform and deployments triggered through GitHub Actions.

Sounds neat, right? Except our first deployments kept getting stuck.

ECS deployment status: IN_PROGRESS forever
Tasks: 3 running instead of 2
GitHub Actions: timed out during aws ecs wait services-stable (the waiter gives up after 40 checks, 15 seconds apart)
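
To see symptoms like these for yourself, it helps to ask ECS what each deployment of the service is doing. Here is a minimal sketch, assuming the AWS CLI is installed and authenticated. The function name and the cluster and service arguments are placeholders, not names from our project.

```shell
#!/usr/bin/env bash
# Sketch: print one row per deployment of an ECS service.
# Columns: status, rolloutState, runningCount, desiredCount.
# show_deployments is a hypothetical helper; pass your own cluster and service.
show_deployments() {
  local cluster="$1" service="$2"
  aws ecs describe-services \
    --cluster "$cluster" \
    --services "$service" \
    --query 'services[0].deployments[].[status, rolloutState, runningCount, desiredCount]' \
    --output text
}
```

Calling show_deployments my-cluster my-service during a stuck rollout should show the PRIMARY deployment sitting in IN_PROGRESS, usually next to an older ACTIVE deployment whose tasks are still running. That overlap is one way to end up with more running tasks than the desired count.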

This was a real-world DevOps headache. Here’s how we diagnosed, fixed, and iterated.

(Note: The app occasionally “takes naps” to save AWS costs. Think of it as serverless beauty sleep.)

🔍 Phase 1 – OIDC & Pipeline Setup

First, we secured GitHub Actions authentication to AWS with an OIDC-federated IAM role:

Created a dedicated IAM role for GitHub Actions
Configured the aws-actions/configure-aws-credentials@v4 action to assume it
Triggered the first deployment

✅ OIDC worked flawlessly. ECS rollout… not so much.
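
A quick way to confirm that the OIDC role was really assumed is to check the caller identity right after the credentials step. This is a sketch, not the exact step from our workflow: assert_assumed_role and the role name you pass to it are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: fail fast if the job is not running as the expected assumed role.
# An assumed-role ARN looks like arn:aws:sts::<account>:assumed-role/<role>/<session>.
assert_assumed_role() {
  local expected_role="$1" arn
  arn="$(aws sts get-caller-identity --query Arn --output text)" || return 1
  case "$arn" in
    *:assumed-role/"$expected_role"/*)
      echo "OK: $arn"
      ;;
    *)
      echo "Unexpected identity: $arn" >&2
      return 1
      ;;
  esac
}
```

Running assert_assumed_role with your role name as a workflow step turns a misconfigured trust policy into an immediate, readable failure instead of a confusing permissions error later in the deploy.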

🛠 Phase 2 – ALB Health Check Fix

Problem:

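The ALB health check requested /, which answered with a redirect (301/302), and by default the ALB counts only 200 as healthy
The health check timeout was too short for the app to respond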

Fix:

Changed the health check path to /api/health
Increased the health check timeout to 10s
Temporarily widened the success codes to 200-302

✅ ALB targets healthy.
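
We made these changes in Terraform, but the same three settings can be tried quickly from the AWS CLI. The sketch below is that CLI equivalent, not our Terraform code. The function names are hypothetical and the target group ARN is yours to supply. If Terraform manages the target group, a CLI change counts as drift and the next apply will revert it.

```shell
#!/usr/bin/env bash
# Sketch: apply the Phase 2 health check settings to a target group.
tune_alb_health_check() {
  local target_group_arn="$1"
  aws elbv2 modify-target-group \
    --target-group-arn "$target_group_arn" \
    --health-check-path /api/health \
    --health-check-timeout-seconds 10 \
    --matcher HttpCode=200-302
}

# Sketch: succeed only if the group has targets and every one reports "healthy".
all_targets_healthy() {
  local states state
  states="$(aws elbv2 describe-target-health \
    --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].TargetHealth.State' \
    --output text)" || return 1
  [ -n "$states" ] || return 1
  # Word splitting is intentional: the states arrive whitespace-separated.
  for state in $states; do
    [ "$state" = "healthy" ] || return 1
  done
}
```

After calling tune_alb_health_check, give the ALB a few health check intervals before expecting all_targets_healthy to pass.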

⚙️ Phase 3 – ECS Container Health Check Fix

Problem:

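The ECS container health check ran before the app had fully started (Next.js warm-up plus the database connection)
The curl-based check failed in the Alpine image, which does not ship curl by default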

Fix:

Installed wget in the Dockerfile
Updated the health check command:

wget --no-verbose --tries=1 --spider http://localhost:3000/api/health || exit 1

Increased startPeriod to 120s (planning 300s, the maximum ECS allows)
Temporarily disabled the ECS container health check to unblock the rollout

✅ ECS service stable, 2/2 running.
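
Picking a startPeriod is easier with a measurement than a guess. Here is a rough sketch that runs the same wget probe in a loop against a freshly started container and reports roughly how long the app took to answer. The function names, the budget, and the interval are assumptions for illustration. ECS itself ignores failed checks during startPeriod, so the measured time plus some headroom is a reasonable starting value.

```shell
#!/usr/bin/env bash
# probe URL: the same check the container health check runs.
probe() {
  wget --no-verbose --tries=1 --spider "$1" >/dev/null 2>&1
}

# wait_until_healthy URL BUDGET_SECONDS [INTERVAL_SECONDS]
# Prints the approximate seconds slept before the probe first passed.
# Returns 1 if the budget runs out first. Time spent inside the probe is not counted.
wait_until_healthy() {
  local url="$1" budget="$2" interval="${3:-5}" waited=0
  while ! probe "$url"; do
    waited=$((waited + interval))
    if [ "$waited" -ge "$budget" ]; then
      return 1
    fi
    sleep "$interval"
  done
  echo "$waited"
}
```

For example, wait_until_healthy http://localhost:3000/api/health 300 5 run right after starting the container locally gives a ballpark boot time to compare against the 120s we configured.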

🎯 The Partial Win

App: Works perfectly
ECS: Deployment COMPLETED
Pipeline: No more timeout

(The app still naps when not in use. Cost optimization, but make it cozy.)

📚 Lessons Learned

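Align ALB and ECS health checks: point both at the same endpoint (/api/health here)
Allow realistic container boot times in startPeriod
It’s okay to relax checks temporarily, as long as you tighten them again
Pipelines must account for ECS timing: a rollout can outlast the wait services-stable window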

🔮 Phase 4 – What’s Next

Re-enable the ECS container health check with tuned values
Add blue/green deployments
Tighten ALB success codes back to 200
Extend the GitHub Actions workflow to verify ECS rollout state and ALB target health before reporting success
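
The pre-success check could look something like the sketch below. It polls the rollout state of the service's PRIMARY deployment with a budget we control, instead of relying on the fixed window of wait services-stable. The function names and default values are assumptions, not code from our pipeline. Note that rolloutState applies to the rolling-update deployment type, and it only flips to FAILED when the deployment circuit breaker is enabled.

```shell
#!/usr/bin/env bash
# rollout_state CLUSTER SERVICE: prints the rolloutState of the PRIMARY deployment.
rollout_state() {
  aws ecs describe-services --cluster "$1" --services "$2" \
    --query "services[0].deployments[?status=='PRIMARY'] | [0].rolloutState" \
    --output text
}

# wait_for_rollout CLUSTER SERVICE [MAX_ATTEMPTS] [INTERVAL_SECONDS]
# Returns 0 on COMPLETED, 1 on FAILED or when attempts run out, 2 if the CLI call fails.
wait_for_rollout() {
  local cluster="$1" service="$2" max_attempts="${3:-80}" interval="${4:-15}"
  local attempt=1 state=""
  while [ "$attempt" -le "$max_attempts" ]; do
    state="$(rollout_state "$cluster" "$service")" || return 2
    case "$state" in
      COMPLETED)
        echo "rollout completed after $attempt check(s)"
        return 0
        ;;
      FAILED)
        echo "rollout failed" >&2
        return 1
        ;;
    esac
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "gave up after $max_attempts checks; last state: $state" >&2
  return 1
}
```

With the defaults shown, the loop waits up to about 20 minutes, which leaves room for a long startPeriod and for old tasks to drain.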

✅ Conclusion

This was real-world DevOps: unblock → iterate → stabilize.

💻 Repo: GitHub
🌐 App: Live FeedbackHub (May be snoozing to save the AWS bill!)
🔗 LinkedIn: Connect with me (Come for the DevOps talk, stay for the ECS nap jokes)
