Deploying a Node.js Application on AWS EKS: A Technical Deep Dive

Enoch
10 min read

A comprehensive guide to deploying a production-ready Node.js application on Amazon Elastic Kubernetes Service, covering infrastructure challenges, troubleshooting methodologies, and best practices.


Introduction

Deploying applications to AWS EKS involves navigating multiple layers of complexity, from infrastructure provisioning to container orchestration. This article documents the technical challenges encountered while deploying a Node.js todo application to a production EKS environment, along with the systematic approaches used to resolve each issue.

Project Overview

The deployment target was a containerized Node.js application featuring a modern web interface for task management. The application included Docker containerization, Kubernetes manifests, and was designed for high-availability deployment on AWS EKS infrastructure.

Challenge 1: Repository Management and Version Control

Problem Analysis

Before the Kubernetes deployment could begin, repository connectivity issues had to be resolved in a way that preserved local development work and kept the local branch synchronized with the remote repository.

Resolution Approach

The repository connection was restored using standard Git recovery procedures. The primary challenge involved resolving branch divergence where large binary files had been inadvertently committed to the repository history.

Repository cleanup was accomplished using Git's history rewriting capabilities, followed by implementing proper .gitignore configurations to prevent similar issues in future development cycles.
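A minimal sketch of the kind of .gitignore rules involved for Terraform-generated artifacts, together with the command for untracking files that were already committed (the exact entries depend on the repository layout):

# .gitignore — keep Terraform's large, generated artifacts out of history
.terraform/
*.tfstate
*.tfstate.backup
crash.log

# Untrack anything already committed before these rules existed
git rm -r --cached .terraform
git commit -m "Stop tracking Terraform artifacts"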

Key Takeaway: Proper repository hygiene and version control practices are essential for maintaining clean development workflows, particularly when working with infrastructure-as-code tools that generate large binary artifacts.

Challenge 2: Infrastructure Version Management

Problem Analysis

The existing Terraform configuration pinned the cluster to Kubernetes version 1.30. Upgrading to version 1.32 was required to take advantage of newer platform and security features and to maintain compatibility with the target deployment architecture.

Implementation Strategy

The version upgrade was implemented through Terraform variable updates:

# variables.tf
variable "cluster_version" {
  description = "Kubernetes version for the EKS cluster"
  type        = string
  default     = "1.32"
}

# terraform.tfvars.example
cluster_version = "1.32"

The infrastructure changes were validated and applied using standard Terraform workflows:

terraform plan
terraform apply

Challenge 3: EKS Infrastructure Provisioning

Infrastructure Architecture

The EKS cluster was provisioned using Terraform with the community-maintained AWS EKS module:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = var.cluster_name
  cluster_version = var.cluster_version

  vpc_id                         = module.vpc.vpc_id
  subnet_ids                     = module.vpc.private_subnets
  cluster_endpoint_public_access = true

  eks_managed_node_groups = {
    main = {
      name = "main-node-group"
      instance_types = ["t3.micro"]
      min_size     = 1
      max_size     = 3
      desired_size = 2

      vpc_security_group_ids = [aws_security_group.node_group_one.id]
    }
  }

  tags = var.tags
}

The infrastructure deployment completed successfully, establishing a two-node cluster running Kubernetes 1.32.3 with appropriate networking and security configurations.
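With the cluster provisioned, access can be verified using the standard AWS CLI and kubectl workflow; a sketch, where the cluster name placeholder stands in for the value defined in the Terraform variables:

# Point kubectl at the new cluster
aws eks update-kubeconfig --region us-east-1 --name <cluster_name>

# Confirm both worker nodes joined and report the expected Kubernetes version
kubectl get nodes -o wide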

Challenge 4: Container Platform Architecture Mismatch

Problem Analysis

During application deployment, the pods failed to start with ImagePullBackOff errors. Investigation revealed a platform architecture incompatibility:

kubectl describe pod simple-node-app-xxx -n simple-node-app
Events:
  Warning  Failed     2m    kubelet  Failed to pull image "akpadetsi/simple-node-app:latest": 
  no match for platform in manifest: not found

The root cause was identified as a mismatch between the container image architecture (ARM64) and the target execution environment (x86_64).

Resolution Strategy

The issue was resolved by rebuilding the container image for the target platform architecture:

# Build for target platform architecture
docker buildx build --platform linux/amd64 -t akpadetsi/simple-node-app:latest .

# Push the rebuilt image to the registry
docker push akpadetsi/simple-node-app:latest

# Validate platform compatibility
docker manifest inspect akpadetsi/simple-node-app:latest

The deployment was updated to utilize the corrected image:

kubectl rollout restart deployment simple-node-app -n simple-node-app

Key Takeaway: Container image architecture must align with the target execution environment. Multi-platform builds should be considered for environments with heterogeneous architectures.
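Where both architectures need to be supported, a single multi-architecture manifest can be built and pushed in one step; a sketch using the same image name as above:

# Create (or reuse) a builder that supports multi-platform builds
docker buildx create --name multiarch --use

# Build for both architectures and push the combined manifest
docker buildx build --platform linux/amd64,linux/arm64 \
  -t akpadetsi/simple-node-app:latest --push .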

Challenge 5: Service Layer Architecture Design

Problem Analysis

The initial Kubernetes manifests included only ClusterIP and NodePort services, which were insufficient for providing external internet access to the application. A LoadBalancer service was required to integrate with AWS networking infrastructure.

Implementation Strategy

A comprehensive service architecture was implemented to support multiple access patterns:

# Internal cluster communication
apiVersion: v1
kind: Service
metadata:
  name: simple-node-app-service
  namespace: simple-node-app
spec:
  selector:
    app: simple-node-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: ClusterIP
---
# Development and debugging access
apiVersion: v1
kind: Service
metadata:
  name: simple-node-app-nodeport
  namespace: simple-node-app
spec:
  selector:
    app: simple-node-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
      nodePort: 30080
  type: NodePort
---
# Production external access
apiVersion: v1
kind: Service
metadata:
  name: simple-node-app-loadbalancer
  namespace: simple-node-app
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "TCP"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "traffic-port"
spec:
  selector:
    app: simple-node-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: LoadBalancer
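After applying these manifests, the NLB hostname appears in the service's EXTERNAL-IP column once AWS finishes provisioning; it can be watched with:

# Watch until the EXTERNAL-IP column is populated with the NLB hostname
kubectl get service simple-node-app-loadbalancer -n simple-node-app -w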

Challenge 6: Resource Capacity and Pod Scheduling Constraints

Problem Analysis

After the container image issue was resolved, pod scheduling failures were observed:

kubectl get pods -n simple-node-app
NAME                               READY   STATUS    RESTARTS   AGE
simple-node-app-7f79555d84-8lmzs   1/1     Running   0          10m
simple-node-app-7f79555d84-v9hz5   1/1     Running   0          35m
simple-node-app-7f79555d84-xyz12   0/1     Pending   0          28m

Diagnostic analysis revealed scheduling constraints:

kubectl describe pod simple-node-app-7f79555d84-xyz12 -n simple-node-app
Events:
  Warning  FailedScheduling  0/2 nodes are available: 2 Too many pods. 
  preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod.

Root Cause Analysis

Node capacity investigation revealed the underlying constraint:

kubectl describe nodes | grep -A 15 "Capacity:\|Allocatable:"
Capacity:
  cpu:                2
  memory:             926572Ki
  pods:               4

Allocatable:
  cpu:                1930m
  memory:             517996Ki
  pods:               4

The t3.micro instance type imposes a maximum pod density of 4 pods per node due to AWS ENI (Elastic Network Interface) limitations. With system pods consuming 3 slots per node, only 1 slot remained available for application workloads.
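The limit follows from the VPC CNI's default IP-per-pod model, in which every pod consumes a secondary IP address on one of the node's ENIs:

# max pods per node = (number of ENIs × (IPv4 addresses per ENI − 1)) + 2
# t3.micro: 2 ENIs × (2 − 1) + 2 = 4 pods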

Resolution Strategy

Two approaches were evaluated:

  1. Horizontal scaling of node infrastructure

  2. Workload optimization to match available capacity

The workload optimization approach was selected for cost efficiency:

kubectl scale deployment simple-node-app --replicas=2 -n simple-node-app

This configuration achieved optimal resource utilization with high availability:

kubectl get pods -n simple-node-app -o wide
NAME                               READY   STATUS    NODE
simple-node-app-7f79555d84-8lmzs   1/1     Running   ip-10-0-2-9.ec2.internal
simple-node-app-7f79555d84-v9hz5   1/1     Running   ip-10-0-3-237.ec2.internal

The final configuration distributed one application pod per node across multiple Availability Zones, ensuring both resource efficiency and fault tolerance.
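Here the even spread fell out naturally because each node had only one free pod slot, but it can also be enforced explicitly. A minimal topologySpreadConstraints sketch (not part of the original manifests; swap the topology key for kubernetes.io/hostname to spread strictly per node):

# deployment.yaml (excerpt) — spread replicas across Availability Zones
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: simple-node-app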

Challenge 7: LoadBalancer DNS Resolution and Connectivity

Problem Analysis

Despite successful LoadBalancer service provisioning, application accessibility through the assigned DNS endpoint was failing with DNS resolution errors.

kubectl get services -n simple-node-app
NAME                           TYPE           EXTERNAL-IP
simple-node-app-loadbalancer   LoadBalancer   a9a39492f4cc7459eaecbbb286f5c231-275b372e95ac4b82.elb.us-east-1.amazonaws.com

Initial connectivity tests revealed DNS resolution failures:

curl http://a9a39492f4cc7459eaecbbb286f5c231-275b372e95ac4b82.elb.us-east-1.amazonaws.com
# curl: (6) Could not resolve host

Diagnostic Methodology

DNS Resolution Analysis

nslookup a9a39492f4cc7459eaecbbb286f5c231-275b372e95ac4b82.elb.us-east-1.amazonaws.com

Initial queries returned NXDOMAIN responses. After a 15-minute interval, DNS resolution succeeded:

Non-authoritative answer:
Name:   a9a39492f4cc7459eaecbbb286f5c231-275b372e95ac4b82.elb.us-east-1.amazonaws.com
Address: 13.223.172.146
Address: 98.86.131.169
Address: 52.73.111.129

Direct IP Connectivity Validation

curl -v http://13.223.172.146
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=UTF-8
< Content-Length: 5233

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Simple Node.js App - Docker & Kubernetes Practice</title>
...

Application Health Verification

# Health endpoint validation
curl http://13.223.172.146/health
{"status":"healthy","timestamp":"2025-07-29T13:59:50.216Z","uptime":1133.575010588}

# API functionality confirmation
curl http://13.223.172.146/api/info
{"name":"Simple Node.js App","version":"1.0.0","hostname":"simple-node-app-7f79555d84-v9hz5"}

Root Cause Analysis

The connectivity issue was attributed to DNS propagation latency inherent in AWS LoadBalancer provisioning. The DNS name of a newly provisioned AWS Elastic Load Balancer can take roughly 10-15 minutes to propagate globally, and the LoadBalancer infrastructure itself was functioning correctly throughout this period.

Production Architecture Overview

The final deployment architecture achieved the following configuration:

  • EKS Cluster: Kubernetes 1.32 running on AWS

  • Node Configuration: 2 t3.micro instances across multiple Availability Zones (us-east-1a, us-east-1b)

  • Pod Distribution: 1 application pod per node with 3 system pods per node (4/4 capacity utilization)

  • Load Balancing: AWS Network Load Balancer with internet-facing, multi-AZ configuration

  • External Access: Multiple IP endpoints with DNS resolution via AWS ELB

Key Lessons Learned

1. Git Repository Management

  • Always maintain proper .gitignore files

  • Never commit large binary files to Git

  • Use history-rewriting tools such as git filter-branch (or the newer git filter-repo) carefully when cleaning history

2. Docker Platform Compatibility

  • Always build images for the target platform

  • Use docker buildx for multi-platform builds

  • Verify image manifests before deployment

3. Kubernetes Resource Constraints

  • Understand node capacity limits (especially for t3.micro)

  • Plan replica counts based on available node slots

  • Monitor pod scheduling and resource utilization

4. AWS LoadBalancer Behavior

  • DNS propagation can take 10-15 minutes

  • Test with direct IPs when DNS isn't resolving

  • Be patient with AWS service provisioning

5. Troubleshooting Methodology

  • Start with the simplest tests (curl, nslookup)

  • Work from the inside out (pods → services → ingress)

  • Use kubectl describe liberally for event details

The Final Result

After all the challenges and troubleshooting, I achieved:

  • High Availability: 2 pods across 2 nodes in different AZs

  • Auto-scaling: HPA configured for traffic spikes (a minimal sketch follows below)

  • Load Balancing: AWS NLB distributing traffic

  • Resource Optimization: 100% node utilization

  • Production Ready: Health checks, monitoring, security
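A minimal sketch of such an HPA, assuming the metrics server is installed and CPU requests are set on the pods (the resource name and thresholds are illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: simple-node-app-hpa
  namespace: simple-node-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: simple-node-app
  minReplicas: 2
  maxReplicas: 4   # extra replicas stay Pending unless node capacity grows
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70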

The deployment has been completed, and the application is now reachable through the load balancer endpoint, as evidenced by the screenshot provided below.

Code Repository

All the code and configurations from this journey are available in my repository:

  • Terraform configurations: EKS cluster setup

  • Kubernetes manifests: All service types and deployments

  • Docker configurations: Multi-platform build setup

  • Node.js application: Complete todo app with modern UI

Challenge 8: Complete Kubernetes Cleanup

The Need for Cleanup

After successfully testing the deployment, I needed to properly clean up all Kubernetes resources to avoid unnecessary AWS costs and maintain a clean cluster state.

The Systematic Cleanup Process

Step 1: Resource Discovery

kubectl get all -n simple-node-app

This revealed:

  • 2 Pods (running containers)

  • 4 Services (ClusterIP, NodePort, 2 LoadBalancers)

  • 1 Deployment

  • 5 ReplicaSets (from various rollouts)

Step 2: Bulk Resource Deletion

kubectl delete all --all -n simple-node-app

This efficiently removed:

  • All pods (terminated containers)

  • All services (including AWS LoadBalancers)

  • The deployment

  • Associated ReplicaSets

Step 3: Check for Additional Resources

kubectl get all,configmaps,secrets,ingress,hpa -n simple-node-app

Found remaining ConfigMaps that needed manual cleanup.

Step 4: Clean Up ConfigMaps

kubectl delete configmap simple-node-app-config -n simple-node-app

Step 5: Delete the Namespace

kubectl delete namespace simple-node-app

This ensured complete cleanup of any remaining resources.

Step 6: Verification

kubectl get namespaces | grep simple-node-app

Empty output confirmed successful cleanup.

What Was Automatically Cleaned Up

Deleting the Kubernetes services automatically triggered AWS to clean up:

  • 2 AWS LoadBalancers (NLB and CLB)

  • Target Groups

  • LoadBalancer Security Groups

  • DNS entries
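Deletion on the AWS side can be spot-checked from the CLI; a sketch, assuming the us-east-1 region used throughout this deployment:

# The NLB created for the LoadBalancer service should no longer be listed
aws elbv2 describe-load-balancers --region us-east-1

# Classic load balancers are listed through the older elb API
aws elb describe-load-balancers --region us-east-1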

Cost Impact

The cleanup immediately stopped charges for:

  • LoadBalancer hours (~$16-18/month per LB)

  • Data transfer costs

  • Target group evaluations

The EKS cluster itself was then deprovisioned with terraform destroy, as evidenced by the screenshot below.
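The teardown followed the standard Terraform workflow, run from the same working directory that created the cluster:

# Preview what will be removed, then tear down the cluster and supporting VPC resources
terraform plan -destroy
terraform destroy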

Final Architecture Achievement

Throughout this journey, I successfully built and deployed a comprehensive EKS solution featuring:

  • High-Availability Cluster: 2-node EKS cluster across multiple Availability Zones

  • Optimized Resource Utilization: Full node capacity utilization (4/4 pods per node)

  • Production Load Balancing: AWS Network Load Balancer with internet-facing configuration

  • Scalable Architecture: Proper pod distribution and resource constraints

  • Complete Connectivity: External access via multiple IP endpoints and DNS resolution

The Node.js application achieved successful deployment on a production-grade AWS EKS cluster with full high availability, load balancing, and complete functionality.

Conclusion

This comprehensive EKS deployment journey demonstrated the complexities and rewards of modern cloud-native application delivery. What began as a straightforward deployment evolved into a systematic exploration of infrastructure management, container orchestration, and production troubleshooting methodologies.

The project successfully addressed multiple technical challenges:

  • Infrastructure as Code: Terraform-based EKS cluster provisioning and lifecycle management

  • Container Platform Compatibility: Resolving Docker image architecture mismatches

  • Resource Capacity Planning: Optimizing pod distribution within node constraints

  • Service Layer Design: Implementing comprehensive Kubernetes service architectures

  • DNS and Networking: Troubleshooting AWS LoadBalancer connectivity

  • Operational Excellence: Complete resource cleanup and cost management

Key technical insights gained:

  1. Platform Architecture Alignment: Container images must match target execution environments

  2. Resource Constraint Understanding: AWS instance types impose specific pod density limitations

  3. DNS Propagation Patience: Cloud load balancers require time for global DNS availability

  4. Systematic Troubleshooting: Methodical problem resolution yields reliable solutions

  5. Infrastructure Lifecycle Management: Proper resource cleanup prevents unnecessary costs

The final architecture achieved production-ready status with high availability, optimal resource utilization, and comprehensive external connectivity. This implementation serves as a foundation for scalable, cloud-native application deployments on AWS EKS.



Written by

Enoch

I have a passion for automating and optimizing cloud infrastructure. I have experience working with various cloud platforms, including AWS, Azure, and Google Cloud. My goal is to help companies achieve scalable, reliable, and secure cloud environments that drive business success.