Building a Production-Ready GKE Cluster with Terraform, Helm & Secure Kubernetes Practices

📌 Overview

In this guide, I’ll walk you through the process of creating a production-grade GKE (Google Kubernetes Engine) cluster using Terraform, Helm, and Kubernetes YAML manifests. This architecture is designed with best practices in mind, offering scalability, security, and automation out of the box.

We’ll deploy a full-stack infrastructure that includes a private GKE cluster, secure VPC networking, TLS with Let's Encrypt, NGINX Ingress, automatic DNS management, and secret handling via Google Secret Manager + CSI Driver. All infrastructure is managed through modular, reusable Terraform code, making the entire deployment automated and repeatable.

📦 GitHub Repository: https://github.com/neamulkabiremon/terraform-gke-cluster.git

Whether you're building a microservices platform, a SaaS product, or deploying containerized workloads in the cloud, this setup is your foundation for running production workloads reliably on Google Cloud. Here's what it includes:

✅ Private GKE Cluster
✅ Secure VPC & NAT
✅ NGINX Ingress Controller + TLS
✅ External DNS with Google Cloud DNS
✅ Workload Identity for Secure SA Usage
✅ Secrets Store CSI Driver with Secret Manager
✅ Infrastructure as Code using Terraform


📂 Project Structure

terraform-gke/
├── 0-locals.tf                    # Project ID, region, enabled APIs
├── 1-providers.tf                 # Terraform provider configurations
├── 2-apis.tf                      # Enable required GCP APIs
├── 3-vpc.tf                       # VPC network definition
├── 4-subnets.tf                   # Public and private subnets
├── 5-nat.tf                       # Cloud NAT setup for private subnet
├── 6-firewalls.tf                 # Firewall rules
├── 7-gke.tf                       # GKE cluster resource
├── 8-gke-nodes.tf                 # Node pool and SA bindings
├── 9-nginx-ingress.tf             # NGINX Ingress via Helm
├── 10-cert-manager.tf             # cert-manager for TLS via Helm
└── kubernetes/
    ├── autoscaling/               # HPA and VPA manifests
    │   ├── HorizontalPodAutoscaler.yaml
    │   └── VerticalPodAutoscaler.yaml
    ├── cert-manager/              # ClusterIssuer config
    │   └── cluster-issuer-prod.yaml
    └── deployment/                # App manifests
        └── crypto-app/
            ├── deployment.yaml
            ├── service.yaml
            └── ingress.yaml

Terraform Setup:

0-locals.tf

locals {
  project_id = "serious-physics-452107-d1"
  region     = "us-central1"
  apis = [
    "compute.googleapis.com",
    "container.googleapis.com",
    "logging.googleapis.com",
    "secretmanager.googleapis.com"
  ]
}

1-providers.tf

provider "google" {
  project = local.project_id
  region  = local.region
}

terraform {
  required_version = ">= 1.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
  }
}
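
Since 9-nginx-ingress.tf also configures a Helm provider, you can optionally pin it by extending the same required_providers block. The ~> 2.x constraint below is an assumption that matches the nested kubernetes {} / set {} block syntax used later in this guide:

terraform {
  required_version = ">= 1.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
    # Optional: pin the Helm provider used in 9-nginx-ingress.tf
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.12"
    }
  }
}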

2-apis.tf

# resource "google_project_service" "compute" {
#   service = "compute.googleapis.com"

#   disable_on_destroy = false
# }

# resource "google_project_service" "container" {
#   service = "container.googleapis.com"

#   disable_on_destroy = false
# }

# resource "google_project_service" "logging" {
#   service = "logging.googleapis.com"

#   disable_on_destroy = false
# }

# resource "google_project_service" "secretmanager" {
#   service = "secretmanager.googleapis.com"

#   disable_on_destroy = false
# }

resource "google_project_service" "api" {
  for_each = toset(local.apis)
  service  = each.key

  disable_on_destroy = false
}
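
After the first apply, a quick way to confirm the APIs were actually enabled:

gcloud services list --enabled --project serious-physics-452107-d1 | grep -E 'compute|container|logging|secretmanager'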

3-vpc.tf

resource "google_compute_network" "vpc" {
  name                            = "main"
  routing_mode                    = "REGIONAL"
  auto_create_subnetworks         = false
  delete_default_routes_on_create = true

  depends_on = [google_project_service.api]
}

# This default route provides internet egress and is required by the Cloud NAT gateway.
# Remove it only if you want the VPC fully private (private nodes would then lose outbound access).
resource "google_compute_route" "default_route" {
  name             = "default-route"
  dest_range       = "0.0.0.0/0"
  network          = google_compute_network.vpc.name
  next_hop_gateway = "default-internet-gateway"
}

4-subnets.tf

resource "google_compute_subnetwork" "public" {
  name                     = "public"
  ip_cidr_range            = "10.0.0.0/19"
  region                   = local.region
  network                  = google_compute_network.vpc.id
  private_ip_google_access = true
  stack_type               = "IPV4_ONLY"
}

resource "google_compute_subnetwork" "private" {
  name                     = "private"
  ip_cidr_range            = "10.0.32.0/19"
  region                   = local.region
  network                  = google_compute_network.vpc.id
  private_ip_google_access = true
  stack_type               = "IPV4_ONLY"

  secondary_ip_range {
    range_name    = "k8s-pods"
    ip_cidr_range = "172.16.0.0/14"
  }

  secondary_ip_range {
    range_name    = "k8s-services"
    ip_cidr_range = "172.20.0.0/18"
  }
}

5-nat.tf

resource "google_compute_address" "nat" {
  name         = "nat"
  address_type = "EXTERNAL"
  network_tier = "PREMIUM"

  depends_on = [google_project_service.api]
}

resource "google_compute_router" "router" {
  name    = "router"
  region  = local.region
  network = google_compute_network.vpc.id
}

resource "google_compute_router_nat" "nat" {
  name   = "nat"
  region = local.region
  router = google_compute_router.router.name

  nat_ip_allocate_option             = "MANUAL_ONLY"
  source_subnetwork_ip_ranges_to_nat = "LIST_OF_SUBNETWORKS"
  nat_ips                            = [google_compute_address.nat.self_link]

  subnetwork {
    name                    = google_compute_subnetwork.private.self_link
    source_ip_ranges_to_nat = ["ALL_IP_RANGES"]
  }
}
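
Once applied, you can sanity-check the router's NAT configuration and the reserved egress address:

# Inspect the Cloud Router (its NAT config is included in the output)
gcloud compute routers describe router --region=us-central1

# Show the static external IP reserved for NAT egress
gcloud compute addresses describe nat --region=us-central1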

6-firewalls.tf

resource "google_compute_firewall" "allow_iap_ssh" {
  name    = "allow-iap-ssh"
  network = google_compute_network.vpc.name

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }

  source_ranges = ["35.235.240.0/20"]
}
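
This rule only allows SSH from Google's IAP range (35.235.240.0/20), so nodes are never exposed directly to the internet. If you ever need a shell on a private node, tunnel through IAP; the instance name below is a placeholder:

# SSH into a private node via Identity-Aware Proxy (replace NODE_INSTANCE_NAME)
gcloud compute ssh NODE_INSTANCE_NAME \
  --zone us-central1-a \
  --tunnel-through-iap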

7-gke.tf

resource "google_container_cluster" "gke" {
  name                     = "gcp-devops-project"
  location                 = "us-central1-a"
  remove_default_node_pool = true
  initial_node_count       = 1
  network                  = google_compute_network.vpc.self_link
  subnetwork               = google_compute_subnetwork.private.self_link
  networking_mode          = "VPC_NATIVE"

  deletion_protection = false

  # Optional, if you want multi-zonal cluster
  # node_locations = ["us-central1-b"]

  addons_config {
    http_load_balancing {
      disabled = true
    }
    horizontal_pod_autoscaling {
      disabled = false
    }
  }

  release_channel {
    channel = "REGULAR"
  }

  workload_identity_config {
    workload_pool = "${local.project_id}.svc.id.goog"
  }

  ip_allocation_policy {
    cluster_secondary_range_name  = "k8s-pods"
    services_secondary_range_name = "k8s-services"
  }

  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    master_ipv4_cidr_block  = "192.168.0.0/28"
  }

  # Jenkins use case
  # master_authorized_networks_config {
  #   cidr_blocks {
  #     cidr_block   = "10.0.0.0/18"
  #     display_name = "private-subnet"
  #   }
  # }
}
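
With Workload Identity enabled, pods can impersonate Google service accounts without exported keys. As a rough sketch of how a workload would use it (the Google service account app-gsa and the Kubernetes ServiceAccount crypto-app-sa below are placeholders, not resources created by this repo):

# Let the KSA development/crypto-app-sa act as a Google service account
gcloud iam service-accounts add-iam-policy-binding \
  app-gsa@serious-physics-452107-d1.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:serious-physics-452107-d1.svc.id.goog[development/crypto-app-sa]"

# Annotate the KSA so GKE knows which Google service account to map it to
kubectl annotate serviceaccount crypto-app-sa -n development \
  iam.gke.io/gcp-service-account=app-gsa@serious-physics-452107-d1.iam.gserviceaccount.com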

8-gke-nodes.tf

resource "google_service_account" "gke" {
  account_id = "demo-gke"
}

resource "google_project_iam_member" "gke_logging" {
  project = local.project_id
  role    = "roles/logging.logWriter"
  member  = "serviceAccount:${google_service_account.gke.email}"
}

resource "google_project_iam_member" "gke_metrics" {
  project = local.project_id
  role    = "roles/monitoring.metricWriter"
  member  = "serviceAccount:${google_service_account.gke.email}"
}

resource "google_container_node_pool" "general" {
  name    = "general"
  cluster = google_container_cluster.gke.id

  autoscaling {
    total_min_node_count = 1
    total_max_node_count = 5
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  node_config {
    preemptible  = false
    machine_type = "e2-medium"

    labels = {
      role = "general"
    }

    # taint {
    #   key    = "instance_type"
    #   value  = "spot"
    #   effect = "NO_SCHEDULE"
    # }

    service_account = google_service_account.gke.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}

9-nginx-ingress.tf

# Required for Terraform to connect to your GKE cluster
data "google_client_config" "default" {}

provider "helm" {
  kubernetes {
    host                   = "https://${google_container_cluster.gke.endpoint}"
    token                  = data.google_client_config.default.access_token
    cluster_ca_certificate = base64decode(google_container_cluster.gke.master_auth[0].cluster_ca_certificate)
  }
}

resource "helm_release" "nginx_ingress" {
  name       = "ingress-nginx"
  repository = "https://kubernetes.github.io/ingress-nginx"
  chart      = "ingress-nginx"
  namespace  = "ingress-nginx"
  create_namespace = true

  set {
    name  = "controller.publishService.enabled"
    value = "true"
  }
}
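
After the apply finishes, the ingress controller's Service should receive an external IP from a Google Cloud load balancer:

kubectl get svc ingress-nginx-controller -n ingress-nginx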

10-cert-manager.tf

resource "helm_release" "cert_manager" {
  name       = "cert-manager"
  repository = "https://charts.jetstack.io"
  chart      = "cert-manager"
  namespace  = "cert-manager"
  create_namespace = true

  set {
    name  = "installCRDs"
    value = "true"
  }

  depends_on = [helm_release.nginx_ingress]
}
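
Before creating any issuers, confirm cert-manager's controller, webhook, and cainjector pods are all running:

kubectl get pods -n cert-manager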

Terraform Cluster Setup

  1. Initialize Terraform:

     terraform init

  2. Provision the Full Infrastructure:

     terraform apply -auto-approve

⚠️ This process takes approximately 10–20 minutes. Let it complete without interruption.

  3. Authenticate with GKE. After creation, get the cluster credentials:

     gcloud container clusters get-credentials gcp-devops-project \
       --zone us-central1-a \
       --project serious-physics-452107-d1

Verify the cluster is ready:

kubectl get nodes

The GKE cluster is now up and running. Next, we'll create the Kubernetes manifests.

πŸ” ClusterIssuer for Let's Encrypt

Define the ClusterIssuer at kubernetes/cert-manager/cluster-issuer-prod.yaml:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: cryptoapp@neamulkabiremon.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx

Apply the ClusterIssuer:

kubectl apply -f kubernetes/cert-manager/cluster-issuer-prod.yaml
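
Check that the issuer has registered with Let's Encrypt and reports Ready:

kubectl get clusterissuer letsencrypt-prod
kubectl describe clusterissuer letsencrypt-prod   # the status should show the ACME account was registered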

📦 Deploy the Crypto App

Create the folder structure:

mkdir -p kubernetes/deployment/crypto-app

deployment.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: development
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: crypto-app
  name: crypto-app
  namespace: development
spec:
  replicas: 2
  selector:
    matchLabels:
      app: crypto-app
  strategy: {}
  template:
    metadata:
      labels:
        app: crypto-app
    spec:
      containers:
      - image: neamulkabiremon/cryptoapp:latest
        imagePullPolicy: Always
        name: crypto-app
        ports:
        - containerPort: 5000
          name: http
          protocol: TCP
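
For production use (and so the HPA/VPA manifests under kubernetes/autoscaling/ have metrics to act on), consider adding resource requests and health probes to the container. A minimal sketch, assuming the app returns a 2xx on / over port 5000 (adjust the probe path to a real health endpoint if one exists):

      containers:
      - name: crypto-app
        image: neamulkabiremon/cryptoapp:latest
        ports:
        - containerPort: 5000
        # The HPA's CPU utilization target is computed against these requests
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /
            port: 5000
        livenessProbe:
          httpGet:
            path: /
            port: 5000
          initialDelaySeconds: 15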

service.yaml

---
apiVersion: v1
kind: Service
metadata:
  name: crypto-service
  labels:
    app: crypto-app
  namespace: development
spec:
  selector:
    app: crypto-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000
  type: NodePort

ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: crypto-ingress
  namespace: development
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    external-dns.alpha.kubernetes.io/hostname: cryptoapp.neamulkabiremon.com
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - cryptoapp.neamulkabiremon.com
      secretName: crypto-cert
  rules:
    - host: cryptoapp.neamulkabiremon.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: crypto-service
                port:
                  number: 80

🚀 Apply the Resources

kubectl apply -f kubernetes/deployment/crypto-app/deployment.yaml
kubectl apply -f kubernetes/deployment/crypto-app/service.yaml
kubectl apply -f kubernetes/deployment/crypto-app/ingress.yaml

Wait a few minutes for the LoadBalancer IP to be assigned:

kubectl get ingress -n development

Copy the IP from the ADDRESS column and use it in your DNS configuration.

🌐 Configure DNS A Record

Go to your DNS provider (e.g., AWS Route 53):

  1. Navigate to Hosted Zones → neamulkabiremon.com

  2. Create a new A Record:

    • Name: cryptoapp.neamulkabiremon.com

    • Type: A

    • Value: <Ingress External IP> from the previous step

Wait for 1–5 minutes for DNS propagation.
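
You can confirm propagation from your machine before opening the browser:

dig +short cryptoapp.neamulkabiremon.com   # should print the Ingress external IP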

✅ TLS + Ingress Validation

Open your browser and visit:

https://cryptoapp.neamulkabiremon.com/

🟒 You should see your secure crypto app’s login page served over HTTPS, protected by a valid Let's Encrypt certificate.
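
You can also verify the certificate from the CLI; cert-manager creates a Certificate resource named after the crypto-cert secret referenced in the Ingress:

kubectl get certificate -n development           # READY should be True
curl -vI https://cryptoapp.neamulkabiremon.com   # inspect the served certificate chain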

You can now login using:

  • Username: admin

  • Password: password123

🎯 Your GKE app is now live, secured, and fully production-ready.

🎉 Final Result

✅ Application is live at: https://cryptoapp.neamulkabiremon.com
✅ Let's Encrypt TLS certificate successfully issued
✅ Ingress Controller is securely routing traffic
✅ DNS record resolving to the Ingress IP (ExternalDNS annotations in place for automated management)

You're now running a secure, production-ready GKE app, backed by automation, scalability, and proper TLS/DNS hygiene.

🧠 Conclusion

In this project, you’ve built a secure and production-ready Kubernetes environment using GKE. From a private cluster setup to automated TLS certificates and DNS records, every component was provisioned as code. The setup is also prepared for advanced features such as ExternalDNS and the Secrets Store CSI Driver with Google Secret Manager.

This foundation enables you to:

✅ Ship updates with confidence
✅ Secure sensitive data natively in Kubernetes
✅ Maintain infrastructure consistency using Terraform
✅ Scale and operate cloud-native apps with ease

You're now equipped with a battle-tested GKE deployment pipeline, ideal for production environments from day one.

