Part 1: From Zero to Production - Build a Scalable Amazon EKS Cluster with Terraform


Welcome to the first article of this Amazon EKS Production-Ready Series! In this hands-on guide, we'll build a fully functional and scalable Amazon EKS (Elastic Kubernetes Service) cluster using Terraform. From configuring the network to deploying the control plane and worker nodes, this tutorial is a must-read for DevOps engineers and cloud professionals who want to set up Kubernetes in a real-world, production-grade environment.
Whether you're a DevOps engineer, cloud architect, or Kubernetes enthusiast, this series is designed to enhance your skills and help you deploy like a pro, with confidence, automation, and scalability built in from the beginning.
🧱 Goal of This Article
In this article, we'll create a production-grade EKS cluster using Terraform. The build includes:
VPC
Subnets (public and private)
NAT Gateway & Internet Gateway
Routing tables
EKS Control Plane
Node Group with IAM roles
This setup mirrors what many real-world production environments use to balance scalability, high availability, and security.
✅ Prerequisites
Before diving into the Terraform configuration, ensure you have the following set up:
🔹 AWS Account: with appropriate IAM permissions to create VPC, EKS, IAM roles, etc.
🔹 AWS CLI: installed and configured via aws configure.
🔹 Terraform CLI: version >= 1.0.
🔹 kubectl: the Kubernetes command-line tool for interacting with the EKS cluster.
🔹 IAM user or role: with full access to EKS, EC2, IAM, and VPC services.
Once the prerequisites are ready, you're good to go! ✅
📁 Project Structure
terraform-eks-production-cluster/
├── 0-locals.tf                        # Reusable local variables (env, region, AZs, etc.)
├── 1-providers.tf                     # Terraform & AWS provider configuration
├── 2-vpc.tf                           # VPC resource
├── 3-igw.tf                           # Internet Gateway
├── 4-subnets.tf                       # Public & private subnets across 2 AZs
├── 5-nat.tf                           # NAT Gateway + Elastic IP for private subnet egress
├── 6-routes.tf                        # Route tables and subnet associations
├── 7-eks.tf                           # EKS control plane + IAM role
├── 8-nodes.tf                         # EKS managed node group + IAM role for nodes
├── iam/
│   └── AWSLoadBalancerController.json # IAM policy for ALB controller
├── values/
│   ├── metrics-server.yaml            # Helm values for Metrics Server
│   └── nginx-ingress.yaml             # Helm values for NGINX Ingress
└── .gitignore                         # Ignore Terraform state, .terraform, secrets, etc.
🔗 GitHub Repository: https://github.com/neamulkabiremon/terraform-eks-production-cluster.git
🔧 Step-by-Step Explanation of Each Terraform File
✅ 0-locals.tf
Defines centralized reusable variables:
locals {
  env         = "production"
  region      = "us-east-1"
  zone1       = "us-east-1a"
  zone2       = "us-east-1b"
  eks_name    = "demo"
  eks_version = "1.30"
}
We define all reusable values here. Think of this as your centralized configuration:
env: your environment name (e.g., staging, production)
region: the AWS region to deploy resources into
zone1 & zone2: AZs for high availability
eks_name: the cluster name
eks_version: the EKS Kubernetes version
These values are used throughout other resources to avoid duplication and support easy environment changes.
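One pattern worth considering, as an assumption on my part rather than something in the repository above, is deriving composite values once, since "${local.env}-${local.eks_name}" is repeated in several resources below. Terraform merges multiple locals blocks, so a minimal sketch could look like:

locals {
  # Hypothetical derived value; resources below could then reference
  # local.cluster_name instead of re-interpolating env and eks_name
  cluster_name = "${local.env}-${local.eks_name}"
}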
✅ 1-providers.tf
Specifies the AWS provider and Terraform version:
provider "aws" {
region = "us-east-1"
}
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.49"
}
}
}
Declares:
The AWS provider region (could be moved to locals)
Terraform version
AWS provider version pinning, which avoids unexpected breaking changes in future versions
This ensures your Terraform setup uses compatible versions of both AWS and Terraform.
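As a small illustration of the note above, here is a hedged sketch of pointing the provider at the region defined in 0-locals.tf instead of a hard-coded string (assuming both files live in the same module):

provider "aws" {
  # Region now comes from 0-locals.tf instead of a literal
  region = local.region
}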
✅ 2-vpc.tf
Creates a VPC with DNS support for Kubernetes:
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "${local.env}-main"
}
}
Creates a Virtual Private Cloud:
CIDR 10.0.0.0/16: large enough to carve out multiple /19 subnets
DNS support and hostname resolution enabled (essential for service discovery in Kubernetes)
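If you later need the VPC ID from scripts or other stacks, a simple output works. This is a hypothetical addition, not a file in the repository layout above:

output "vpc_id" {
  # Handy for wiring other stacks or for quick inspection after apply
  description = "ID of the production VPC"
  value       = aws_vpc.main.id
}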
✅ 3-igw.tf
Provisions an Internet Gateway so public subnets can reach the internet:
resource "aws_internet_gateway" "igw" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${local.env}-igw"
}
}
An Internet Gateway allows public subnets to access the internet. We attach it to the VPC and tag it for visibility.
✅ 4-subnets.tf
Creates four subnets, two private and two public, across the two AZs:
resource "aws_subnet" "private_zone1" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.0.0/19"
  availability_zone = local.zone1

  tags = {
    "Name"                                                = "${local.env}-private-${local.zone1}"
    "kubernetes.io/role/internal-elb"                     = "1"
    "kubernetes.io/cluster/${local.env}-${local.eks_name}" = "owned"
  }
}

resource "aws_subnet" "private_zone2" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.32.0/19"
  availability_zone = local.zone2

  tags = {
    "Name"                                                = "${local.env}-private-${local.zone2}"
    "kubernetes.io/role/internal-elb"                     = "1"
    "kubernetes.io/cluster/${local.env}-${local.eks_name}" = "owned"
  }
}

resource "aws_subnet" "public_zone1" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.64.0/19"
  availability_zone       = local.zone1
  map_public_ip_on_launch = true

  tags = {
    "Name"                                                = "${local.env}-public-${local.zone1}"
    "kubernetes.io/role/elb"                              = "1"
    "kubernetes.io/cluster/${local.env}-${local.eks_name}" = "owned"
  }
}

resource "aws_subnet" "public_zone2" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.96.0/19"
  availability_zone       = local.zone2
  map_public_ip_on_launch = true

  tags = {
    "Name"                                                = "${local.env}-public-${local.zone2}"
    "kubernetes.io/role/elb"                              = "1"
    "kubernetes.io/cluster/${local.env}-${local.eks_name}" = "owned"
  }
}
Two private subnets: for worker nodes (kept off the public internet)
Two public subnets: for the NAT Gateway and ingress/egress traffic, with map_public_ip_on_launch = true
Kubernetes-specific tags such as kubernetes.io/role/internal-elb and kubernetes.io/role/elb let load balancer controllers auto-discover the right subnets, and the kubernetes.io/cluster/... tag marks them as owned by this cluster. A more compact alternative is sketched below.
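Since the four subnet resources are nearly identical, a hedged alternative (an assumption, not how the repository is written) is to drive them from a map with for_each. This sketch covers the private pair; the public pair would follow the same shape:

locals {
  # Hypothetical map of AZ => CIDR for the private subnets
  private_subnets = {
    (local.zone1) = "10.0.0.0/19"
    (local.zone2) = "10.0.32.0/19"
  }
}

resource "aws_subnet" "private" {
  for_each          = local.private_subnets
  vpc_id            = aws_vpc.main.id
  cidr_block        = each.value
  availability_zone = each.key

  tags = {
    "Name"                                                = "${local.env}-private-${each.key}"
    "kubernetes.io/role/internal-elb"                     = "1"
    "kubernetes.io/cluster/${local.env}-${local.eks_name}" = "owned"
  }
}

The explicit per-subnet resources above are easier to read in a tutorial, so either style is defensible.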
✅ 5-nat.tf
Adds NAT Gateway and Elastic IP for private subnet internet access:
resource "aws_eip" "nat" {
domain = "vpc"
tags = {
Name = "${local.env}-nat"
}
}
resource "aws_nat_gateway" "nat" {
allocation_id = aws_eip.nat.id
subnet_id = aws_subnet.public_zone1.id
tags = {
Name = "${local.env}-nat"
}
depends_on = [aws_internet_gateway.igw]
}
Private subnets can't reach the internet unless we:
Allocate an Elastic IP
Create a NAT Gateway in a public subnet
This setup ensures outbound internet access without exposing workloads.
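Note that a single NAT Gateway in one AZ is a deliberate cost trade-off and a single point of failure for private-subnet egress. A hedged sketch of the per-AZ alternative (hypothetical resource names, not part of the repository) looks like this; it would also require a separate private route table per AZ, which 6-routes.tf below does not do:

resource "aws_eip" "nat_zone2" {
  domain = "vpc"

  tags = {
    Name = "${local.env}-nat-${local.zone2}"
  }
}

resource "aws_nat_gateway" "nat_zone2" {
  # Second NAT in the other AZ so zone2 egress survives a zone1 outage
  allocation_id = aws_eip.nat_zone2.id
  subnet_id     = aws_subnet.public_zone2.id

  tags = {
    Name = "${local.env}-nat-${local.zone2}"
  }

  depends_on = [aws_internet_gateway.igw]
}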
✅ 6-routes.tf
Defines route tables and subnet associations:
resource "aws_route_table" "private" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.nat.id
}
tags = {
Name = "${local.env}-private"
}
}
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.igw.id
}
tags = {
Name = "${local.env}-public"
}
}
resource "aws_route_table_association" "private_zone1" {
subnet_id = aws_subnet.private_zone1.id
route_table_id = aws_route_table.private.id
}
resource "aws_route_table_association" "private_zone2" {
subnet_id = aws_subnet.private_zone2.id
route_table_id = aws_route_table.private.id
}
resource "aws_route_table_association" "public_zone1" {
subnet_id = aws_subnet.public_zone1.id
route_table_id = aws_route_table.public.id
}
resource "aws_route_table_association" "public_zone2" {
subnet_id = aws_subnet.public_zone2.id
route_table_id = aws_route_table.public.id
}
Route tables control how subnets send traffic:
Private route table: sends 0.0.0.0/0 to the NAT Gateway
Public route table: sends 0.0.0.0/0 to the Internet Gateway
Each route table is then associated with its corresponding subnets.
✅ 7-eks.tf
Provisions the EKS cluster control plane with proper IAM role:
resource "aws_iam_role" "eks" {
name = "${local.env}-${local.eks_name}-eks-cluster"
assume_role_policy = <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Principal": {
"Service": "eks.amazonaws.com"
}
}
]
}
POLICY
}
resource "aws_iam_role_policy_attachment" "eks" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
role = aws_iam_role.eks.name
}
resource "aws_eks_cluster" "eks" {
name = "${local.env}-${local.eks_name}"
version = local.eks_version
role_arn = aws_iam_role.eks.arn
vpc_config {
endpoint_private_access = false
endpoint_public_access = true
subnet_ids = [
aws_subnet.private_zone1.id,
aws_subnet.private_zone2.id
]
}
access_config {
authentication_mode = "API"
bootstrap_cluster_creator_admin_permissions = true
}
depends_on = [ aws_iam_role_policy_attachment.eks ]
}
This creates the EKS control plane:
An IAM role with a trust policy allows eks.amazonaws.com to assume it
The cluster's network interfaces live in the private subnets, while endpoint_public_access = true keeps the API endpoint publicly reachable
access_config switches authentication to API mode and bootstraps the cluster creator as an admin
A couple of hypothetical outputs for downstream use are sketched below.
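To make the cluster easier to consume after apply (for example, to script the kubeconfig update shown later), outputs like these could be added. They are an assumption on my part, not part of the repository layout above:

output "cluster_name" {
  # Feed this to: aws eks update-kubeconfig --name <cluster_name>
  value = aws_eks_cluster.eks.name
}

output "cluster_endpoint" {
  # Public API server endpoint of the control plane
  value = aws_eks_cluster.eks.endpoint
}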
✅ 8-nodes.tf
Creates a managed EKS node group with correct IAM roles:
resource "aws_iam_role" "nodes" {
name = "${local.env}-${local.eks_name}-eks-nodes"
assume_role_policy = <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
}
}
]
}
POLICY
}
# This policy now includes AssumeRoleForPodIdentity for the Pod Identity Agent
resource "aws_iam_role_policy_attachment" "amazon_eks_worker_node_policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
role = aws_iam_role.nodes.name
}
resource "aws_iam_role_policy_attachment" "amazon_eks_cni_policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
role = aws_iam_role.nodes.name
}
resource "aws_iam_role_policy_attachment" "amazon_ec2_container_registry_read_only" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
role = aws_iam_role.nodes.name
}
resource "aws_eks_node_group" "general" {
cluster_name = aws_eks_cluster.eks.name
version = local.eks_version
node_group_name = "general"
node_role_arn = aws_iam_role.nodes.arn
subnet_ids = [
aws_subnet.private_zone1.id,
aws_subnet.private_zone2.id
]
capacity_type = "ON_DEMAND"
instance_types = ["t3.medium"]
scaling_config {
desired_size = 1
max_size = 10
min_size = 0
}
update_config {
max_unavailable = 1
}
labels = {
role = "general"
}
depends_on = [
aws_iam_role_policy_attachment.amazon_eks_worker_node_policy,
aws_iam_role_policy_attachment.amazon_eks_cni_policy,
aws_iam_role_policy_attachment.amazon_ec2_container_registry_read_only,
]
# Allow external changes without Terraform plan difference
lifecycle {
ignore_changes = [scaling_config[0].desired_size]
}
}
Here we define the EKS node group:
The node IAM role is assumed by EC2 instances
The attached IAM policies allow worker node management, CNI networking, and pulling images from ECR
Nodes are deployed in the private subnets with desired/min/max scaling configs
Labels (e.g., role = general or app) help group nodes by purpose; a tainted second node group is sketched below
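If you later need to separate workload classes, a second node group with a taint is a common pattern. The sketch below is hypothetical (the repository only defines the general group) and reuses the node IAM role from above:

resource "aws_eks_node_group" "batch" {
  cluster_name    = aws_eks_cluster.eks.name
  version         = local.eks_version
  node_group_name = "batch"
  node_role_arn   = aws_iam_role.nodes.arn

  subnet_ids = [
    aws_subnet.private_zone1.id,
    aws_subnet.private_zone2.id
  ]

  capacity_type  = "SPOT"
  instance_types = ["t3.large"]

  scaling_config {
    desired_size = 0
    max_size     = 5
    min_size     = 0
  }

  labels = {
    role = "batch"
  }

  # Only pods that tolerate this taint will be scheduled here
  taint {
    key    = "dedicated"
    value  = "batch"
    effect = "NO_SCHEDULE"
  }

  depends_on = [
    aws_iam_role_policy_attachment.amazon_eks_worker_node_policy,
    aws_iam_role_policy_attachment.amazon_eks_cni_policy,
    aws_iam_role_policy_attachment.amazon_ec2_container_registry_read_only,
  ]
}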
🧪 Validate & Apply the Terraform Configuration
Run the following commands to initialize, validate, and apply the configuration:
terraform init
terraform validate
terraform apply -auto-approve
Terraform will create the infrastructure, and it may take some time. In my case, it took 15 minutes to provision.
🔐 Authenticate the Cluster
Once created, authenticate and test using the AWS CLI:
aws eks update-kubeconfig --region us-east-1 --name production-demo
kubectl get nodes
If nodes are listed, your EKS cluster is running 🎉
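You can also confirm the control plane endpoint is reachable:
kubectl cluster-info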
⚙️ What's Next?
This is just Day 1 of our series. In the upcoming days, we'll enhance this cluster and cover:
Role-Based Access Control (RBAC)
Deploying the AWS ALB Ingress Controller
Setting up the Ingress Controller with NGINX
Enabling Cluster Autoscaling
Configuring Persistent Volume Claims (PVC)
Managing Secrets securely
TLS Certificates using Cert-Manager
🔔 Follow me to stay updated and get notified when the next article is published!
#EKS #Terraform #AWS #DevOps #Kubernetes #InfrastructureAsCode #CloudEngineering #CI_CD #IaC #Observability
Written by Neamul Kabir Emon
Hi! I'm a highly motivated Security and DevOps professional with 7+ years of combined experience. My expertise bridges penetration testing and DevOps engineering, allowing me to deliver a comprehensive security approach.