Deploy a Kubernetes cluster on GKE

Google Kubernetes Engine (GKE) is a powerful, managed Kubernetes service that allows you to deploy containerized applications at scale. In this blog, we'll walk through how to provision a GKE cluster using Terraform, an infrastructure-as-code (IaC) tool that helps automate the entire process.
Whether you're a DevOps engineer, cloud enthusiast, or just starting out, this guide will give you hands-on knowledge of deploying production-ready infrastructure on GCP.
Deploying a GKE cluster using Terraform is a clean and reusable way to manage infra as code.
Here's a full step-by-step guide using Terraform. (We will perform everything in Google Cloud Shell.)
Prerequisites
- Google Cloud account with billing enabled
- Terraform installed (install it in Google Cloud Shell)
- gcloud CLI installed and configured, with the required APIs enabled (Compute Engine, Service Usage, Kubernetes Engine)

gcloud version   # check whether gcloud is installed
# If it is installed, you will see output like:
# Google Cloud SDK 456.0.0
# bq 2.0.84
# core 2023.04.01
# gcloud 2023.04.01
# kubectl 1.28.1
# If you get "command not found", it's not installed.
# Install it from: https://cloud.google.com/sdk/docs/install

gcloud auth login
gcloud auth list
gcloud config list
gcloud config set project <your-project-id>
# You should see output like:
# [core]
# account = your.email@gmail.com
# project = your-gcp-project-id
If the project isn't set, run:
gcloud config set project <your-gcp-project-id>
- A service account with the required roles (Kubernetes Engine Admin, Compute Admin, etc.)

Create a Service Account:
gcloud iam service-accounts create terraform-gke \
  --description="Service account for Terraform to manage GKE" \
  --display-name="Terraform GKE Admin"

# Grant IAM roles (replace <your-project-id> with your project ID)
gcloud projects add-iam-policy-binding <your-project-id> \
  --member="serviceAccount:terraform-gke@<your-project-id>.iam.gserviceaccount.com" \
  --role="roles/container.admin"

gcloud projects add-iam-policy-binding <your-project-id> \
  --member="serviceAccount:terraform-gke@<your-project-id>.iam.gserviceaccount.com" \
  --role="roles/compute.admin"

gcloud projects add-iam-policy-binding <your-project-id> \
  --member="serviceAccount:terraform-gke@<your-project-id>.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountUser"
These roles allow Terraform to:
Create/manage GKE clusters
Manage networking and compute
Act as service accounts (needed to attach service accounts to cluster nodes)
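To verify the bindings took effect, you can list the roles granted to the new service account with the standard gcloud filtering pattern (here `<your-project-id>` is a placeholder, as above):

```shell
# Show all roles bound to the terraform-gke service account
gcloud projects get-iam-policy <your-project-id> \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:terraform-gke@<your-project-id>.iam.gserviceaccount.com" \
  --format="value(bindings.role)"
```

You should see the three roles granted above (roles/container.admin, roles/compute.admin, roles/iam.serviceAccountUser).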
Create and Download a JSON Key:
gcloud iam service-accounts keys create ~/terraform-gke-key.json \
--iam-account terraform-gke@<your-project-id>.iam.gserviceaccount.com
This JSON file is your service account key; keep it safe!
Export Credentials for Terraform
Before running Terraform:
export GOOGLE_APPLICATION_CREDENTIALS=~/terraform-gke-key.json
Project Structure
mkdir gke-cluster && cd gke-cluster
touch main.tf variables.tf outputs.tf terraform.tfvars
gke-cluster/   # folder
├── main.tf
├── variables.tf
├── outputs.tf
└── terraform.tfvars
main.tf
provider "google" {
project = var.project_id
region = var.region #
region = var.zone # use this if you face quota issue
}
resource "google_container_cluster" "primary" {
name = var.cluster_name
location = var.region
location = var.zone # use this if you face quota issue
remove_default_node_pool = true
initial_node_count = 1
networking_mode = "VPC_NATIVE"
}
resource "google_container_node_pool" "primary_nodes" {
name = "primary-node-pool"
location = var.region
location = var.zone # use this if you face quota issue
cluster = google_container_cluster.primary.name
node_config {
machine_type = "e2-medium"
disk_size_gb = 50 # π Add this line to reduce disk usage
oauth_scopes = [
"https://www.googleapis.com/auth/cloud-platform",
]
}
initial_node_count = 1
}
Why This Works
GCP by default allocates 100 GB SSD per node
You didn't override that, so you're getting the default quota usage.
Even with one node, if your cluster is regional, GKE might try to deploy across three zones within the region. So:
1 node pool Γ 1 node Γ 3 zones Γ 100 GB = 300 GB
That's why you're seeing a request for 300 GB even if you're provisioning just one node.
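As a quick sanity check of that arithmetic (assuming the 100 GB default disk and a three-zone region):

```shell
# Default disk math for a regional cluster in a 3-zone region
nodes_per_zone=1
zones=3
default_disk_gb=100
total=$((nodes_per_zone * zones * default_disk_gb))
echo "${total} GB requested"   # prints "300 GB requested"
```

With a 250 GB regional quota, the 300 GB request fails, which matches the error below.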
Fix: deploy a zonal cluster instead of a regional one, i.e. change:
location = var.region  ->  location = var.zone
(See the error section below for more details.)
variables.tf
variable "project_id" {
type = string
}
variable "region" {
type = string
default = "ap-south1"
}
variable "zone" {
type = string
default = "ap-south1-a"
}
variable "cluster_name" {
type = string
default = "my-gke-cluster"
}
terraform.tfvars
project_id   = "your-gcp-project-id"
region       = "asia-south1"
zone         = "asia-south1-a"
cluster_name = "my-gke-cluster"
outputs.tf
output "cluster_name" {
value = google_container_cluster.primary.name
}
output "kubeconfig_command" {
value = "gcloud container clusters get-credentials ${google_container_cluster.primary.name} --region ${var.region}"
}
Steps to Deploy
1. Authenticate
Make sure you're authenticated with GCP and have set the project:
gcloud auth application-default login
2. Initialize & Apply Terraform
terraform init
terraform plan
terraform apply -auto-approve
3. Configure kubectl
After deployment finishes, run the output command:
gcloud container clusters get-credentials my-gke-cluster --region asia-south1
kubectl get nodes
Boom! You're connected to your GKE cluster.
4. Deploy Something
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --type=LoadBalancer --port=80
kubectl get svc
To Destroy
terraform destroy -auto-approve
Errors You May Face
ERROR: (gcloud.container.clusters.list) ResponseError: code=403, message=Kubernetes Engine API has not been used in project before or it is disabled.
Means: The service account has the right permissions, but the GKE API is still disabled, so it can't do anything
How to Fix It
Open the link shown in the error message in your browser (the Kubernetes Engine API page), click "Enable" at the top of the page, and wait 1-2 minutes for it to fully activate.
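You can also enable the API from Cloud Shell instead of the console, against the currently configured project:

```shell
# Enable the Kubernetes Engine API for the active project
gcloud services enable container.googleapis.com
```

It can still take a minute or two to propagate after the command returns.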
Error 403: Insufficient regional quota to satisfy request: resource "SSD_TOTAL_GB": request requires '300.0' and is short '50.0'. project has a quota of '250.0' with '250.0' available.
Means: GKE cluster creation is requesting 300 GB of SSD disk, but your project only has a quota of 250 GB in that region.
How to Fix It
In main.tf, you can manually control the disk size by adding disk_size_gb to the node_config block of your google_container_node_pool resource:
disk_size_gb = 50 # add this line to reduce disk usage
Error: Cannot destroy cluster because deletion_protection is set to true. Set it to false to proceed with cluster deletion.
Means: your GKE cluster has deletion protection enabled, which prevents Terraform (or anyone) from accidentally deleting the cluster.
How to Fix It
Solution: disable deletion_protection in Terraform.
Update your Terraform config for the cluster to explicitly disable deletion protection. In the google_container_cluster block, add:
resource "google_container_cluster" "primary" {
name = var.cluster_name
location = var.zone
remove_default_node_pool = true
initial_node_count = 1
networking_mode = "VPC_NATIVE"
deletion_protection = false # add this line
}
Then run:
terraform apply
This will update the cluster and turn off deletion protection; after that, terraform destroy will succeed.
Steps to Access the nginx App (LoadBalancer Service)
1. Get the External IP
Run:
kubectl get svc
You should see something like:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx LoadBalancer 10.0.12.45 <pending> 80:32345/TCP 1m
If the EXTERNAL-IP is still <pending>, it means GCP is still provisioning the external load balancer (this can take 1-3 minutes).
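Instead of re-running kubectl get svc by hand, you can watch the service until the external IP appears:

```shell
# Watch the nginx service; press Ctrl+C once EXTERNAL-IP is populated
kubectl get svc nginx --watch
```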
2. Once EXTERNAL-IP Is Ready
Example:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx LoadBalancer 10.0.12.45 34.133.45.23 80:32345/TCP 2m
Now you can access your NGINX app in the browser:
http://34.133.45.23
Final Thoughts
Using Terraform to deploy GKE clusters allows you to manage Kubernetes infrastructure declaratively, reproducibly, and at scale. With just a few configuration files, you can spin up or destroy entire clusters, which is especially powerful for CI/CD pipelines and infrastructure automation.
Written by

Aditya Khadanga
A DevOps practitioner dedicated to sharing practical knowledge. Expect in-depth tutorials and clear explanations of DevOps concepts, from fundamentals to advanced techniques. Join me on this journey of continuous learning and improvement!