Deploy a Kubernetes cluster on GKE

Aditya Khadanga
6 min read

Google Kubernetes Engine (GKE) is a powerful, managed Kubernetes service that allows you to deploy containerized applications at scale. In this blog, we’ll walk through how to provision a GKE cluster using Terraform, an infrastructure-as-code (IaC) tool that helps automate the entire process.

Whether you're a DevOps engineer, cloud enthusiast, or just starting out, this guide will give you hands-on knowledge of deploying production-ready infrastructure on GCP.

Deploying a GKE cluster using Terraform is a clean, reusable way to manage infrastructure as code.

Here's a full step-by-step guide using Terraform. (We'll perform everything in Google Cloud Shell.)

βœ… Prerequisites

  1. Google Cloud account with billing enabled

  2. Terraform installed (it comes pre-installed in Google Cloud Shell)

  3. gcloud CLI installed and configured, with the required APIs enabled (Compute Engine, Service Usage, Kubernetes Engine)

     gcloud version #check whether gcloud is installed
    
     #if it is installed, you will see output like this:
     Google Cloud SDK 456.0.0
     bq 2.0.84
     core 2023.04.01
     gcloud 2023.04.01
     kubectl 1.28.1
    
     #if you get "command not found", it's not installed.
     #Install it from: https://cloud.google.com/sdk/docs/install
    
     gcloud auth login
     gcloud auth list
     gcloud config list
     gcloud config set project <your-project-id>
    
     # you should see output similar to this:
     [core]
     account = your.email@gmail.com
     project = your-gcp-project-id
    

    If the project isn't set, run:

     gcloud config set project <your-gcp-project-id>
    
  4. A service account with permissions (Kubernetes Engine Admin, Compute Admin, etc.)

    Create a Service Account:

     gcloud iam service-accounts create terraform-gke \
       --description="Service account for Terraform to manage GKE" \
       --display-name="Terraform GKE Admin"
    
     # Grant the IAM roles below (replace <your-project-id> with your project ID)
    
     gcloud projects add-iam-policy-binding <your-project-id> \
       --member="serviceAccount:terraform-gke@<your-project-id>.iam.gserviceaccount.com" \
       --role="roles/container.admin"
    
     gcloud projects add-iam-policy-binding <your-project-id> \
       --member="serviceAccount:terraform-gke@<your-project-id>.iam.gserviceaccount.com" \
       --role="roles/compute.admin"
    
     gcloud projects add-iam-policy-binding <your-project-id> \
       --member="serviceAccount:terraform-gke@<your-project-id>.iam.gserviceaccount.com" \
       --role="roles/iam.serviceAccountUser"
    

πŸ“Œ Above roles allow Terraform to:

  • Create/manage GKE clusters

  • Manage networking and compute

  • Act as the service account (roles/iam.serviceAccountUser)
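As an alternative to the gcloud binding commands above, the same grants can be expressed in Terraform itself. This is a sketch, not the method used in this guide; it assumes you would also manage the service account from the same config:

```hcl
# Hypothetical alternative: manage the service account and one of its
# role bindings in Terraform instead of via gcloud.
resource "google_service_account" "terraform_gke" {
  account_id   = "terraform-gke"
  display_name = "Terraform GKE Admin"
}

resource "google_project_iam_member" "gke_admin" {
  project = var.project_id
  role    = "roles/container.admin"
  member  = "serviceAccount:${google_service_account.terraform_gke.email}"
}
```

The same pattern repeats for `roles/compute.admin` and `roles/iam.serviceAccountUser`.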

Create and Download a JSON Key:

gcloud iam service-accounts keys create ~/terraform-gke-key.json \
  --iam-account terraform-gke@<your-project-id>.iam.gserviceaccount.com

This JSON file is your service account key β€” keep it safe! πŸ”

Export Credentials for Terraform

Before running Terraform:

export GOOGLE_APPLICATION_CREDENTIALS=~/terraform-gke-key.json
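Note that Cloud Shell's home directory persists across sessions, but exported environment variables do not. An optional convenience is to persist the export in `~/.bashrc` so you don't have to re-run it every session:

```shell
# Optional: persist the credentials variable across Cloud Shell sessions.
echo 'export GOOGLE_APPLICATION_CREDENTIALS="$HOME/terraform-gke-key.json"' >> ~/.bashrc
grep GOOGLE_APPLICATION_CREDENTIALS ~/.bashrc
```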

πŸ—‚οΈ Project Structure

mkdir gke-cluster && cd gke-cluster
touch main.tf variables.tf outputs.tf terraform.tfvars

This gives the following layout:

gke-cluster/
├── main.tf
├── variables.tf
├── outputs.tf
└── terraform.tfvars

πŸ“„ main.tf

provider "google" {
  project = var.project_id
  region  = var.region
  zone    = var.zone   # used by zonal resources; helps if you face quota issues
}

resource "google_container_cluster" "primary" {
  name     = var.cluster_name
  location = var.region   # regional cluster
  # location = var.zone   # zonal cluster: use this instead if you face quota issues

  remove_default_node_pool = true
  initial_node_count       = 1

  networking_mode = "VPC_NATIVE"
}

resource "google_container_node_pool" "primary_nodes" {
  name       = "primary-node-pool"
  location   = var.region   # match the cluster's location
  # location = var.zone     # use this instead if you face quota issues
  cluster    = google_container_cluster.primary.name

  node_config {
    machine_type = "e2-medium"
    disk_size_gb  = 50               # πŸ‘ˆ Add this line to reduce disk usage
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform",
    ]
  }

  initial_node_count = 1
}

🧠 Why This Works

  • GCP by default allocates 100 GB SSD per node

  • You didn’t override that, so you're getting the default quota usage

  • Even with one node, if your cluster is regional, GKE might try to deploy across three zones within the region. So:

    1 node pool Γ— 1 node Γ— 3 zones Γ— 100 GB = 300 GB

    That's why you're seeing a request for 300 GB even if you're provisioning just one node.

    Fix: deploy a zonal cluster instead of a regional one, i.e. change:

    location = var.region »»» location = var.zone

    (see the error section below for more details)
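The quota math above can be checked with quick shell arithmetic (the 3-zone count is typical for a GCP region, and 100 GB is GKE's default boot disk size):

```shell
nodes_per_zone=1   # initial_node_count
zones=3            # a regional cluster replicates the pool across each zone
disk_gb=100        # GKE's default boot disk size per node
total_gb=$((nodes_per_zone * zones * disk_gb))
echo "SSD_TOTAL_GB requested: $total_gb"
```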

πŸ“„ variables.tf

variable "project_id" {
  type = string
}

variable "region" {
  type    = string
  default = "asia-south1"
}

variable "zone" {
  type    = string
  default = "asia-south1-a"
}

variable "cluster_name" {
  type    = string
  default = "my-gke-cluster"
}

πŸ“„ terraform.tfvars

project_id   = "your-gcp-project-id"
region       = "asia-south1"
zone         = "asia-south1-a"
cluster_name = "my-gke-cluster"

πŸ“„ outputs.tf

output "cluster_name" {
  value = google_container_cluster.primary.name
}

output "kubeconfig_command" {
  value = "gcloud container clusters get-credentials ${google_container_cluster.primary.name} --region ${var.region}"
}
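If you also want the control-plane address handy, an extra output can be added. This is an optional sketch; `endpoint` is an attribute of `google_container_cluster`, marked sensitive here so it isn't echoed in plain `terraform output`:

```hcl
output "cluster_endpoint" {
  value     = google_container_cluster.primary.endpoint
  sensitive = true
}
```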

πŸš€ Steps to Deploy

βœ… 1. Authenticate

Make sure you're authenticated with GCP and have set the project:

gcloud auth application-default login

βœ… 2. Initialize & Apply Terraform

terraform init
terraform plan
terraform apply -auto-approve

βœ… 3. Configure kubectl

After deployment finishes, run the output command:

gcloud container clusters get-credentials my-gke-cluster --region asia-south1
# use --zone asia-south1-a instead if you created a zonal cluster
kubectl get nodes

Boom β€” you're connected to your GKE cluster! πŸŽ‰


βœ… 4. Deploy Something

kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --type=LoadBalancer --port=80
kubectl get svc

🧹 To Destroy:

terraform destroy -auto-approve

Errors You May Face

ERROR: (gcloud.container.clusters.list) ResponseError: code=403, message=Kubernetes Engine API has not been used in project before or it is disabled.

Means: The service account has the right permissions, but the GKE API is still disabled, so it can't do anything

βœ… How to Fix It

  1. Go to this link in your browser (from the error message):
    πŸ‘‰ Enable Kubernetes Engine API

  2. Click "Enable" (top of the page).

  3. Wait 1–2 minutes for it to fully activate.
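Instead of clicking through the console, the API can also be enabled declaratively from the same Terraform config. A sketch using the `google_project_service` resource (`disable_on_destroy = false` leaves the API enabled when the resource is destroyed):

```hcl
# Enable the Kubernetes Engine API from Terraform itself.
resource "google_project_service" "container" {
  project            = var.project_id
  service            = "container.googleapis.com"
  disable_on_destroy = false
}
```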

Error 403: Insufficient regional quota to satisfy request: resource "SSD_TOTAL_GB": request requires '300.0' and is short '50.0'. project has a quota of '250.0' with '250.0' available.

Means: GKE cluster creation is requesting 300 GB of SSD disk, but your project only has a quota of 250 GB in that region.
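The numbers in the error line up exactly:

```shell
requested=300   # what the regional cluster asks for (3 zones x 100 GB)
quota=250       # regional SSD_TOTAL_GB quota in the project
echo "Short by $((requested - quota)) GB"
```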

βœ… How to Fix It

  1. Go to main.tf.

  2. Manually control the disk size by adding disk_size_gb to the node_config block in your google_container_node_pool resource:

    disk_size_gb = 50 # πŸ‘ˆ Add this line to reduce disk usage

Error: Cannot destroy cluster because deletion_protection is set to true. Set it to false to proceed with cluster deletion.

Means: your GKE cluster has deletion protection enabled, which prevents Terraform (or anyone) from accidentally deleting it.

βœ… How to Fix It

Solution: Disable deletion_protection in Terraform

Just update your Terraform config for the cluster to explicitly disable deletion protection.

πŸ”§ In google_container_cluster block, add:

resource "google_container_cluster" "primary" {
  name     = var.cluster_name
  location = var.zone

  remove_default_node_pool = true
  initial_node_count       = 1

  networking_mode = "VPC_NATIVE"
  deletion_protection = false  # πŸ‘ˆ Add this line
}

Then do:

terraform apply

This will update the cluster and turn off deletion protection.

Steps to Access the nginx App (LoadBalancer Service)

πŸ” 1. Get the External IP

Run:

kubectl get svc

You should see something like:

NAME         TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)        AGE
nginx        LoadBalancer   10.0.12.45     <pending>        80:32345/TCP   1m

⏳ If the EXTERNAL-IP is still <pending>, it means GCP is still provisioning the external load balancer (can take 1–3 minutes).

βœ… 2. Once EXTERNAL-IP Is Ready

Example:

NAME         TYPE           CLUSTER-IP     EXTERNAL-IP       PORT(S)        AGE
nginx        LoadBalancer   10.0.12.45     34.133.45.23       80:32345/TCP   2m

Now you can access your NGINX app in the browser:

➑️ http://34.133.45.23

βœ… Final Thoughts

Using Terraform to deploy GKE clusters allows you to manage Kubernetes infrastructure declaratively, reproducibly, and at scale. With just a few configuration files, you can spin up or destroy entire clusters, which is especially powerful for CI/CD pipelines and infrastructure automation.


Written by

Aditya Khadanga

A DevOps practitioner dedicated to sharing practical knowledge. Expect in-depth tutorials and clear explanations of DevOps concepts, from fundamentals to advanced techniques. Join me on this journey of continuous learning and improvement!