From ECS Timeout to CI/CD Green: A Real-World DevOps Journey


🚀 Introduction – The Problem
Our FeedbackHub project runs on AWS ECS Fargate with infrastructure managed by Terraform and deployments triggered through GitHub Actions.
Sounds neat, right? Except — our first deployments kept getting stuck.
ECS deployment status: IN_PROGRESS forever
Tasks: 3 running instead of 2
GitHub Actions: Timed out during wait services-stable
This was a real-world DevOps headache. Here’s how we diagnosed, fixed, and iterated.
(Note: The app occasionally “takes naps” to save AWS costs. Think of it as serverless beauty sleep.)
🔍 Phase 1 – OIDC & Pipeline Setup
Secure GitHub Actions authentication via OIDC IAM Role in AWS:
Created dedicated IAM role for GitHub Actions
Configured configure-aws-credentials@v4
Triggered first deployment
✅ OIDC worked flawlessly. ECS rollout… not so much.
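For reference, the trust relationship behind such a role can be sketched as follows. This is a minimal sketch, not the project's actual policy: the account ID, repository name, and branch are hypothetical placeholders, and the helper github_oidc_trust_policy is invented for illustration. It also assumes GitHub's OIDC provider is already registered in the AWS account.

```python
import json


def github_oidc_trust_policy(account_id: str, repo: str, branch: str = "main") -> dict:
    """Build an IAM trust policy letting one repo/branch assume a role via OIDC."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience used by the configure-aws-credentials action.
                        f"{provider}:aud": "sts.amazonaws.com",
                        # Restrict the role to a single repository and branch.
                        f"{provider}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical account ID and repository, for illustration only.
    policy = github_oidc_trust_policy("123456789012", "example-org/feedbackhub")
    print(json.dumps(policy, indent=2))
```

Pinning the sub claim to one repository and branch means workflows from other repositories cannot assume the role, even though they use the same OIDC provider.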
🛠 Phase 2 – ALB Health Check Fix
Problem:
ALB called / → redirect (301/302)
Timeout too short
Fix:
Changed path to /api/health
Increased timeout to 10s
Temporarily allowed success codes 200-302
✅ ALB targets healthy.
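To see why a redirect fails a strict check but passes the relaxed one, here is a small sketch of how a success-code matcher such as 200-302 can be evaluated. The function matches_success_codes is hypothetical and only mimics the matcher syntax (single codes, comma-separated lists, ranges). It is not AWS code.

```python
def matches_success_codes(matcher: str, status: int) -> bool:
    """Return True if status satisfies a matcher like "200", "200,202" or "200-302"."""
    for part in matcher.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            if int(low) <= status <= int(high):
                return True
        elif part and int(part) == status:
            return True
    return False


if __name__ == "__main__":
    # A redirect from "/" fails the strict matcher but passes the relaxed one.
    print(matches_success_codes("200", 301))      # False
    print(matches_success_codes("200-302", 301))  # True
    # A health endpoint returning 200 passes both, so the matcher can be tightened later.
    print(matches_success_codes("200", 200))      # True
```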
⚙️ Phase 3 – ECS Container Health Check Fix
Problem:
ECS health check ran before app fully started (Next.js warmup + DB)
curl failed in Alpine
Fix:
Installed wget in Dockerfile
Updated health check command:
wget --no-verbose --tries=1 --spider http://localhost:3000/api/health || exit 1
Increased startPeriod to 120s (planning 300s)
Temporarily disabled ECS health checks to unblock
✅ ECS service stable, 2/2 running.
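The timing side of this fix can be sketched as a container health check definition plus a rough time budget. This is illustrative only: startPeriod and the command come from the steps above, while interval, timeout, and retries are assumed values that this post does not state. The helper seconds_until_unhealthy is hypothetical and gives only an approximation of when a container that never becomes ready could first be marked unhealthy.

```python
# Shape follows the healthCheck object of an ECS container definition.
health_check = {
    "command": [
        "CMD-SHELL",
        "wget --no-verbose --tries=1 --spider http://localhost:3000/api/health || exit 1",
    ],
    "interval": 30,      # seconds between checks (assumed)
    "timeout": 10,       # seconds before a single check counts as failed (assumed)
    "retries": 3,        # consecutive failures before "unhealthy" (assumed)
    "startPeriod": 120,  # grace period from the post; failures here are not counted
}


def seconds_until_unhealthy(check: dict) -> int:
    """Rough estimate of how long a never-ready container survives."""
    return check["startPeriod"] + check["retries"] * check["interval"]


if __name__ == "__main__":
    print(seconds_until_unhealthy(health_check))                          # 210
    print(seconds_until_unhealthy({**health_check, "startPeriod": 300}))  # 390
```

If the app needs longer than this budget to warm up, ECS is likely to replace the task before it can serve traffic, which is one way a deployment ends up stuck in IN_PROGRESS.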
🎯 The Partial Win
App: Works perfectly
ECS: Deployment COMPLETED
Pipeline: No more timeout
(The app still naps when not in use. Cost optimization, but make it cozy.)
📚 Lessons Learned
Align ALB & ECS health checks
Allow realistic container boot times
It’s okay to relax checks temporarily
Pipelines must account for ECS timing
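As an illustration of the last lesson, a pipeline step can poll with an explicit time budget instead of relying on the default of wait services-stable, which, per the AWS CLI documentation, polls every 15 seconds and gives up after 40 checks. The sketch below is hypothetical: wait_until_stable and the get_state callable are invented names, and the stability rule (one active deployment, running count equal to desired count) is an assumption modeled on what that waiter checks. In a real pipeline, get_state would wrap a call to the ECS API.

```python
import time
from typing import Callable, Dict


def is_stable(state: Dict[str, int]) -> bool:
    """Assumed rule: a single deployment and all desired tasks running."""
    return state["deployments"] == 1 and state["running"] == state["desired"]


def wait_until_stable(
    get_state: Callable[[], Dict[str, int]],
    timeout_s: float = 900.0,
    poll_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_state until the service is stable or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_stable(get_state()):
            return True
        # Stop if another poll would land past the deadline.
        if clock() + poll_s > deadline:
            return False
        sleep(poll_s)


if __name__ == "__main__":
    # Simulated rollout: old and new deployments overlap, then settle at 2/2.
    states = iter([
        {"deployments": 2, "running": 3, "desired": 2},
        {"deployments": 2, "running": 3, "desired": 2},
        {"deployments": 1, "running": 2, "desired": 2},
    ])
    print(wait_until_stable(lambda: next(states), sleep=lambda _: None))  # True
```

Passing sleep and clock as parameters keeps the loop testable without real waiting. The timeout should be chosen to exceed the container's start period plus the ALB's time to mark targets healthy.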
🔮 Phase 4 – What’s Next
Re-enable tuned ECS health checks
Add blue/green deployments
Tighten ALB success codes to 200
Enhance GitHub Actions with pre-success ECS/ALB checks
✅ Conclusion
This was real-world DevOps: unblock → iterate → stabilize.
💻 Repo: GitHub
🌐 App: Live FeedbackHub (May be snoozing to save the AWS bill!)
🔗 LinkedIn: Connect with me (Come for the DevOps talk, stay for the ECS nap jokes)
Written by Deepak Kumar
