The Complete Terraform Guide


Table of Contents:
Introduction
What is Terraform?
Why use Infrastructure as Code (IaC)?
Terraform vs. Other IaC tools (CloudFormation, Pulumi, Ansible)
Installation & Setup
Installing Terraform (Windows, macOS, Linux)
Verifying Installation
Setting Up IDE (VSCode + Terraform extension)
Creating First Terraform Project (Hello World)
Terraform Basics
Providers
What are Providers?
Popular Providers (AWS, GCP, Azure, DigitalOcean)
Resources
Creating, updating, and destroying resources
Resource Lifecycle (Create/Update/Delete)
Data Sources
Retrieving Information from External Resources
Terraform Configuration Syntax (HCL)
Syntax basics & structure
Variables & Input variables
Output variables
Locals
Expressions, Conditionals, and Functions
Terraform Formatting (terraform fmt)
Terraform State Management
What is State?
Local vs. Remote state
Popular Remote State Backends:
AWS S3 with DynamoDB Locking
Terraform Cloud
Azure Storage Account
Google Cloud Storage
State Locking & Concurrency
Importing Existing Infrastructure (terraform import)
State Management Commands (Refresh, Move, Remove)
Terraform Modules
Creating Modules
Using Modules from:
Local Paths
Git repositories
Terraform Registry
Module Best Practices:
Reusable & Extensible Modules
Versioning & Publishing modules
Terraform Workspaces & Environments
Managing Multiple Environments (dev/staging/prod)
Using Terraform Workspaces
Environment-specific configurations and variables
Terraform Provisioners
Remote-exec and Local-exec
File Provisioners
When to Use and Best Practices
Limitations & Alternatives (Packer, Cloud-init, User Data scripts)
Terraform with Popular Cloud Providers
AWS Quickstart
EC2 Instances
VPC & Networking
IAM
RDS
Azure Quickstart
Virtual Machines
Networking & Security groups
Azure SQL Database
GCP Quickstart
Compute Engine VM instances
Cloud Networking
Cloud SQL
DigitalOcean Quickstart
Droplets
VPCs and Firewalls
Advanced Terraform Concepts
Remote Modules
Terraform Cloud & Terraform Enterprise
Remote Runs
State Storage
Collaboration
Terraform Sentinel Policy as Code
Writing and enforcing policies
Custom Providers (building your own)
CDK for Terraform (CDKTF)
Terraform & CI/CD Pipelines
Terraform in CI/CD (GitHub Actions, GitLab CI, Jenkins)
Automating deployments and managing approvals
Rollbacks and Disaster Recovery scenarios
Terraform Security Best Practices
Securing state files
Least privilege access policies
Security scanning with tfsec and checkov
Secrets management (Vault, AWS Secrets Manager, GitHub Secrets)
Common Errors & Troubleshooting
Common Terraform errors and solutions
Debugging Terraform (TF_LOG, Verbose Mode)
Handling state conflicts and corruption
Recovery from failed deployments
Terraform Cheat Sheet (Quick Reference)
HCL syntax quick reference
Commonly-used built-in functions
Terraform environment variables
Terraform best-practice snippets
Real-World Project Example
Complete production-ready project using AWS:
VPC, Subnets, Security Groups
EC2 with Auto Scaling & Load Balancing
Managed Databases (RDS)
Remote state management
CI/CD integration (GitLab CI example)
Bonus
Introduction
What is Terraform?
Terraform is an open-source infrastructure as code (IaC) software tool created by HashiCorp. It enables developers, system administrators, and DevOps engineers to safely and predictably create, change, and manage infrastructure across various cloud providers (AWS, Azure, Google Cloud, DigitalOcean, etc.) as well as on-premises resources.
Instead of manually configuring and managing your servers, databases, networks, and storage, Terraform lets you define everything in simple, readable configuration files.
Terraform's main components:
Providers: Allow Terraform to interact with external APIs (AWS, Azure, Google Cloud, Kubernetes, etc.).
Resources: Individual infrastructure objects like servers, networks, storage buckets, databases, etc.
State: A record of your current infrastructure managed by Terraform.
Why use Infrastructure as Code (IaC)?
Infrastructure as Code (IaC) is a method for managing and provisioning infrastructure using code, rather than manual processes.
Key advantages of IaC:
Consistency & Reproducibility: Infrastructure can be consistently reproduced across environments (dev, staging, prod).
Automation: Reduces manual errors by automating deployment and management.
Documentation: Infrastructure code acts as clear documentation of the current state.
Version Control: Infrastructure changes can be reviewed, approved, and versioned.
Collaboration: Multiple team members can safely collaborate and track infrastructure changes.
Why Terraform?
Terraform has become a widely adopted IaC tool for several reasons:
Declarative: Clearly define desired state, and Terraform figures out how to achieve it.
Cloud-Agnostic: One tool to manage multiple cloud providers.
Extensible: Supports a huge variety of services via plugins called Providers.
Strong Community: Widely supported with active development and community resources.
State Management: Robust management of existing infrastructure through state files.
Terraform vs. Other IaC Tools
Here's a brief comparison between Terraform and other popular IaC tools:
Feature | Terraform | AWS CloudFormation | Pulumi | Ansible |
Configuration Style | Declarative (HCL) | Declarative (JSON/YAML) | Imperative (Languages: JS, Python, Go, C#) | Declarative/Procedural (YAML) |
Cloud Agnostic | Yes | No (AWS-specific) | Yes | Yes |
State Management | Built-in state file | AWS Managed | Built-in | Limited state management |
Learning Curve | Moderate | Moderate | Moderate (if familiar with coding languages) | Moderate to High |
Community & Ecosystem | Large, active community | Large AWS-specific community | Growing, developer-centric | Extensive (Ops-focused) |
Use cases | Cloud & On-prem Infrastructure | AWS Infrastructure only | Multi-cloud, cloud-native apps | Configuration & Automation |
Terraform is ideal for multi-cloud scenarios, predictable infrastructure, and declarative style.
CloudFormation is AWS-only; best if you're fully AWS-integrated.
Pulumi suits developers comfortable with traditional programming languages.
Ansible is great for configuration management, automation, and orchestration.
Terraform Workflow
Terraform follows a simple workflow:
terraform init → terraform plan → terraform apply → terraform destroy
init: Initializes Terraform and downloads necessary providers.
plan: Shows the proposed infrastructure changes without applying them.
apply: Executes and applies infrastructure changes.
destroy: Removes previously created infrastructure.
Quick Example (Hello World)
Let's quickly see Terraform in action with a simple example creating an AWS EC2 instance:
# main.tf
provider "aws" {
region = "us-west-2"
}
resource "aws_instance" "example" {
ami = "ami-0c55b159cbfafe1f0" # Amazon Linux 2 AMI
instance_type = "t2.micro"
tags = {
Name = "Terraform-HelloWorld"
}
}
Execute Terraform commands:
terraform init
terraform plan
terraform apply
You’ve now successfully created infrastructure with Terraform!
Prerequisites
Basic knowledge of cloud infrastructure (AWS, Azure, or Google Cloud).
Familiarity with command-line tools.
Understanding of fundamental IT concepts (servers, networking, databases).
What You’ll Gain from this Tutorial
By completing this tutorial, you’ll:
Understand and master Terraform concepts from basics to advanced.
Write clear and effective Terraform configurations.
Manage infrastructure safely and efficiently.
Troubleshoot common Terraform issues.
Learn best practices, tips, and advanced Terraform usage.
Installation & Setup
In this section, you'll install Terraform, set up your development environment, and create your very first Terraform project.
Installing Terraform
Terraform is available for Windows, macOS, and Linux. Follow these easy steps to install Terraform on your preferred OS.
For Windows:
Using Chocolatey (Recommended):
choco install terraform
Manual Install:
Download the Terraform binary for Windows.
Unzip the file.
Move terraform.exe to a directory like C:\terraform.
Add this directory to your System PATH environment variable.
For macOS:
Using Homebrew (Recommended):
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
Manual Install:
Download the Terraform binary for macOS.
Unzip the file.
Move the binary into /usr/local/bin:
mv terraform /usr/local/bin/
chmod +x /usr/local/bin/terraform
For Linux (Ubuntu/Debian):
Using HashiCorp Repository (Recommended):
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
Manual Install:
Download the Terraform binary for Linux.
Unzip and move the binary into /usr/local/bin:
unzip terraform*.zip
sudo mv terraform /usr/local/bin/
sudo chmod +x /usr/local/bin/terraform
Verify Installation
Check if Terraform is correctly installed by running:
terraform -version
You should see an output similar to:
Terraform v1.8.4
on darwin_amd64
IDE Setup & Tools (VSCode)
Using an IDE such as VSCode greatly enhances your Terraform workflow.
Setup VSCode with Terraform Extension:
Install Visual Studio Code
Open VSCode → Go to the Extensions tab (Ctrl+Shift+X or Cmd+Shift+X)
Install the official Terraform extension by HashiCorp.
Recommended VSCode extensions:
Terraform (by HashiCorp) – Syntax highlighting, auto-completion, linting.
HashiCorp Configuration Language (HCL) – Syntax highlighting and snippets.
Creating Your First Terraform Project
Let's create a simple Terraform project to understand the basic workflow clearly.
Step 1: Create Project Directory
mkdir terraform-project
cd terraform-project
Step 2: Create your first Terraform file (main.tf):
# main.tf
terraform {
required_providers {
random = {
source = "hashicorp/random"
version = "~> 3.5.1"
}
}
}
provider "random" {}
resource "random_pet" "name" {
length = 3
separator = "-"
}
output "pet_name" {
value = random_pet.name.id
}
This simple configuration creates a random pet name.
Initialize Terraform Project
Now initialize your project directory to download necessary plugins and providers:
terraform init
Sample Output:
Terraform has been successfully initialized!
Plan & Preview Changes
The terraform plan command lets you preview your infrastructure before applying changes:
terraform plan
Sample output snippet:
Plan: 1 to add, 0 to change, 0 to destroy.
Apply Changes
To apply your infrastructure changes:
terraform apply
Terraform will prompt for confirmation; type yes to proceed:
random_pet.name: Creating...
random_pet.name: Creation complete after 0s [id=amazing-purple-butterfly]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Outputs:
pet_name = "amazing-purple-butterfly"
Congratulations. You just created your first Terraform-managed resource.
Destroy Infrastructure
When done experimenting, you can remove your resource easily:
terraform destroy
Again, Terraform will ask for confirmation (yes) before deleting resources.
Common Setup Issues (Troubleshooting):
PATH Issues: Ensure the Terraform binary location is correctly added to your PATH environment variable.
Permissions Issues: On Linux/macOS, ensure the binary has executable permissions:
chmod +x /usr/local/bin/terraform
Recommended Folder Structure:
A clear and maintainable structure for Terraform projects:
terraform-project/
├── modules/
│ └── your-module/
│ └── main.tf
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── outputs.tf
│ └── prod/
│ ├── main.tf
│ ├── variables.tf
│ └── outputs.tf
└── README.md
modules: Reusable Terraform modules.
environments: Different environment configurations (dev, staging, prod).
README: Documentation & instructions.
Terraform Basics
In this section, you'll master fundamental Terraform concepts: Providers, Resources, Data Sources, Variables, Outputs, and Terraform State.
Providers
Terraform interacts with external services through providers. Providers enable Terraform to manage various types of resources across multiple cloud and on-premises platforms.
Defining Providers
Providers are defined within your configuration files:
provider "aws" {
region = "us-east-1"
}
This example uses the AWS provider and sets the default region.
Popular Providers:
Terraform supports a vast ecosystem of providers, including:
Cloud Providers: AWS, Azure, Google Cloud, DigitalOcean
Infrastructure Services: Kubernetes, Docker, VMware
Monitoring & Logging: Datadog, New Relic, Splunk
Networking: Cloudflare, Cisco
Other SaaS Products: GitHub, PagerDuty, Vault
Check the full list on the Terraform Registry.
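Provider versions should be pinned so runs stay reproducible. A minimal sketch of a required_providers block (the AWS provider and the "~> 5.0" constraint are illustrative):

```hcl
terraform {
  # Pin the provider source and an allowed version range
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # any 5.x release
    }
  }
  required_version = ">= 1.0" # minimum Terraform CLI version
}
```

The ~> (pessimistic) constraint allows newer releases within the pinned series while blocking major-version jumps.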
Resources
A resource represents an infrastructure object like a VM, database, network component, etc.
Resource Syntax
Resources follow a straightforward structure:
resource "<resource_type>" "<resource_name>" {
<property> = "<value>"
}
Example - AWS EC2 instance:
resource "aws_instance" "web_server" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t2.micro"
tags = {
Name = "MyWebServer"
}
}
Resource Naming Best Practices:
Use descriptive, clear names (web_server, database, load_balancer)
Follow consistent naming conventions (e.g., snake_case)
Data Sources
Data sources fetch and use external data or resources that Terraform didn't create but needs information from.
Data Source Syntax
data "<data_source_type>" "<name>" {
# parameters
}
Example – Getting latest Amazon Linux AMI:
data "aws_ami" "amazon_linux" {
most_recent = true
filter {
name = "name"
values = ["amzn2-ami-hvm-*-x86_64-ebs"]
}
owners = ["amazon"]
}
resource "aws_instance" "web_server" {
ami = data.aws_ami.amazon_linux.id
instance_type = "t2.micro"
}
This dynamically fetches the latest Amazon Linux AMI ID, ensuring you always use the current AMI.
Variables & Outputs
Variables and outputs help to make Terraform configurations reusable, flexible, and informative.
Input Variables
Define customizable inputs with default values:
Syntax:
variable "instance_type" {
description = "EC2 instance type"
type = string
default = "t2.micro"
}
Using the variable:
resource "aws_instance" "web_server" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = var.instance_type
}
Passing variables via CLI:
terraform apply -var="instance_type=t3.medium"
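Variables can also be supplied through a terraform.tfvars file, which Terraform loads automatically (the value shown is illustrative):

```hcl
# terraform.tfvars -- picked up automatically by plan/apply
instance_type = "t3.medium"
```

Additional variable files can be passed explicitly with terraform apply -var-file="prod.tfvars".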
Output Variables
Outputs display important information from the resources you create:
output "public_ip" {
value = aws_instance.web_server.public_ip
description = "The public IP of the web server"
}
Display outputs after applying changes:
terraform apply
Or manually:
terraform output
Terraform State
Terraform keeps track of infrastructure it manages via a state file (terraform.tfstate).
Why Terraform State?
Tracks current infrastructure state
Maps real-world resources to Terraform resources
Enables Terraform to detect changes and perform updates correctly
Local vs. Remote State:
Aspect | Local State | Remote State (Recommended) |
Location | Local filesystem (terraform.tfstate ) | Remote storage (AWS S3, Azure Blob, Terraform Cloud) |
Collaboration | Limited, single-user | Enables multi-user collaboration |
Security | Lower (risk of exposure/loss) | Higher (secure, versioned, backed-up) |
Remote State Example (AWS S3 Backend):
terraform {
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-lock-table"
encrypt = true
}
}
S3 stores state securely.
DynamoDB locks the state, preventing simultaneous conflicting edits.
Terraform Lifecycle Commands (Quick Reminder):
Command | Action |
terraform init | Initializes project, downloads providers |
terraform plan | Previews changes without applying |
terraform apply | Applies changes to infrastructure |
terraform destroy | Removes infrastructure created by Terraform |
terraform validate | Validates configuration files |
terraform fmt | Formats your Terraform files |
Terraform Resource Lifecycle Management
Terraform lets you control the lifecycle of resources explicitly:
Lifecycle Meta-argument:
create_before_destroy (ensures new resources exist before destroying old ones)
prevent_destroy (protects critical resources)
ignore_changes (ignores specified attribute changes)
Example usage:
resource "aws_instance" "database" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.medium"
lifecycle {
prevent_destroy = true
create_before_destroy = true
ignore_changes = [tags]
}
}
Terraform Best Practices Recap:
Separate environments (dev/prod/staging) clearly.
Remote state management for collaboration.
Use variables and outputs to enhance reusability.
Keep your configurations modular and organized.
Leverage data sources for dynamic information.
Terraform Configuration Syntax (HCL)
Terraform configurations are written using HashiCorp Configuration Language (HCL). In this section, you'll master HCL syntax, expressions, functions, conditionals, locals, and formatting best practices.
HCL Basics and Syntax
HCL files typically have a .tf extension and consist of configuration blocks defining resources, providers, variables, etc.
Basic Structure of Terraform Files:
block_type "block_label_1" "block_label_2" {
attribute1 = "value1"
attribute2 = "value2"
nested_block {
attribute3 = "value3"
}
}
block_type: Type of block (e.g., resource, provider, module, variable).
block_label: Identifies the specific instance of a block type.
attributes: Key-value pairs providing details.
Example (resource block):
resource "aws_instance" "web" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.medium"
tags = {
Name = "WebServer"
}
}
Expressions and Types
Terraform supports various data types, including strings, numbers, booleans, lists, maps, sets, and objects.
Common Data Types:
string = "Hello Terraform!"
number = 42
boolean = true
list = ["us-east-1a", "us-east-1b"]
map = {
Environment = "production"
Owner = "DevOps"
}
object = {
name = "db"
type = "t3.medium"
}
set = toset(["apple", "banana", "orange"])
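These types can be combined into richer constraints. A sketch of a list-of-objects variable (the variable name and values are illustrative):

```hcl
variable "servers" {
  description = "Per-server settings"
  type = list(object({
    name = string
    size = string
  }))
  default = [
    { name = "web-1", size = "t3.micro" },
    { name = "web-2", size = "t3.small" },
  ]
}
```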
Variables and Locals
Variables:
Input variables make configurations reusable:
variable "region" {
type = string
description = "AWS Region"
default = "us-east-1"
}
resource "aws_instance" "example" {
ami = "ami-12345678"
instance_type = "t2.micro"
availability_zone = "${var.region}a"
}
Local Variables (Locals):
Use locals to simplify complex expressions and reuse logic:
locals {
env_name = "prod"
common_tags = {
Environment = local.env_name
ManagedBy = "Terraform"
}
}
resource "aws_instance" "server" {
ami = "ami-12345678"
instance_type = "t2.medium"
tags = local.common_tags
}
Conditionals
Terraform conditionals allow dynamic decisions based on variables.
Conditional Expression Syntax:
condition ? true_value : false_value
Example:
variable "is_production" {
default = false
}
resource "aws_instance" "web_server" {
ami = var.is_production ? "ami-prod123" : "ami-dev456"
instance_type = "t3.micro"
}
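Conditionals are also commonly combined with count to create a resource only when a flag is set. A sketch (the variable name and AMI are illustrative):

```hcl
variable "create_bastion" {
  type    = bool
  default = false
}

resource "aws_instance" "bastion" {
  # 1 instance when enabled, 0 (resource skipped) otherwise
  count         = var.create_bastion ? 1 : 0
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"
}
```

When count is used, the resource is addressed as a list, e.g. aws_instance.bastion[0].id.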
Terraform Functions (Commonly Used)
Terraform provides built-in functions to simplify configuration tasks.
String Functions:
upper("hello") # "HELLO"
lower("WORLD") # "world"
format("Hello, %s!", "Terraform") # "Hello, Terraform!"
Collection Functions:
length(["a", "b", "c"]) # 3
contains(["a", "b"], "b") # true
merge({a=1}, {b=2}) # {a=1,b=2}
Numeric Functions:
min(10, 5, 3) # 3
max(10, 5, 3) # 10
ceil(4.1) # 5
floor(4.9) # 4
Encoding Functions:
jsonencode({name="tf"}) # {"name":"tf"}
jsondecode("{\"key\":\"value\"}") # {key="value"}
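These functions compose naturally inside locals. A small sketch (the names and values are illustrative):

```hcl
locals {
  team = "devops"
  azs  = ["us-east-1a", "us-east-1b", "us-east-1c"]

  name_prefix = format("%s-app", upper(local.team)) # "DEVOPS-app"
  az_count    = length(local.azs)                   # 3
  base_tags = merge(
    { Team = local.team },
    { ManagedBy = "Terraform" }
  )
}
```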
Terraform Formatting (terraform fmt)
Terraform provides built-in formatting to maintain consistency:
Formatting Command:
terraform fmt
Automatically formats .tf files consistently.
Helps maintain readability and clean git diffs.
Recommended to run before commits.
Comments & Documentation
Use comments to document Terraform code clearly:
Single-line Comment:
# Single-line comment describing the next resource
resource "aws_instance" "example" {
ami = "ami-12345678"
instance_type = "t3.micro"
}
Multi-line Comment:
/*
This is a multi-line comment.
Great for detailed explanations.
*/
resource "aws_instance" "example" {
ami = "ami-12345678"
instance_type = "t3.micro"
}
HCL Best Practices & Tips
Avoid Hardcoding Values: Prefer variables, locals, or data sources.
Use Locals for Repetition: Centralize repeated logic.
Utilize Built-in Functions: Simplify logic with built-in Terraform functions.
Comment and Document: Clearly explain configuration decisions.
Run terraform fmt Regularly: Enforce readability and consistent style.
Common Mistakes in Terraform Syntax:
Unquoted Strings: Attributes expect strings in quotes (instance_type = "t2.micro").
Missing Commas in Lists: List items must be separated by commas (["a", "b", "c"]).
Incorrect Variable References: Use var.variable_name syntax consistently.
Syntax Errors in Conditionals: Use the proper format condition ? true_value : false_value.
Terraform State Management
Managing Terraform's state correctly is critical to safely and efficiently maintaining your infrastructure. In this section, you'll master state management, including local and remote state, state locking, state import/export, and troubleshooting common state-related issues.
What is Terraform State?
Terraform State (terraform.tfstate) is a JSON file Terraform uses to track and manage the resources it provisions.
Purpose of Terraform State:
Keeps track of resources managed by Terraform.
Maps real-world resources to Terraform configuration.
Enables incremental updates, change detection, and resource lifecycle management.
Never edit state files manually. Instead, use Terraform CLI commands to interact with state.
Local vs Remote State
Terraform supports two main ways of managing state:
Type of State | Description | Recommended Usage |
Local | Stored locally (terraform.tfstate ) | Small, personal projects |
Remote | Stored in remote backends (S3, Azure, Terraform Cloud) | Production, team collaboration |
Local State (Default)
By default, Terraform stores the state locally in your project directory.
Advantages and Disadvantages:
Easy setup, suitable for quick tests.
Not suitable for teams or production: risk of losing or exposing state.
Example: Default local state setup (no explicit backend required):
terraform apply
This creates a local terraform.tfstate file.
Remote State (Recommended)
Remote state provides secure, shared storage accessible by multiple users.
Popular Remote State Backends:
AWS S3 with DynamoDB Locking
Terraform Cloud
Azure Storage
Google Cloud Storage
Example: AWS S3 Backend with DynamoDB
Secure, scalable state storage with locking capability.
Step-by-Step Setup:
1. Create S3 bucket and DynamoDB table (via AWS CLI):
aws s3 mb s3://my-terraform-state-bucket
aws dynamodb create-table \
--table-name terraform-lock-table \
--attribute-definitions AttributeName=LockID,AttributeType=S \
--key-schema AttributeName=LockID,KeyType=HASH \
--billing-mode PAY_PER_REQUEST
2. Configure Terraform backend:
terraform {
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "prod/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-lock-table"
encrypt = true
}
}
bucket: S3 bucket storing the state file.
key: Path to the state file within the bucket.
dynamodb_table: Ensures state locking to prevent concurrent modifications.
encrypt: Enables state file encryption (recommended).
Initialize the backend:
terraform init
State Locking (Concurrency Management)
Terraform uses state locking to prevent concurrent modifications that could corrupt your state.
Local State: Locks state during operations automatically.
Remote State: Uses locking mechanisms provided by backends (DynamoDB, Azure Storage, Terraform Cloud).
If a lock occurs, you’ll see messages like:
Error acquiring the state lock
Resolving State Locks:
Wait for current operations to finish.
Force unlock (only if you're sure):
terraform force-unlock LOCK_ID
Importing Existing Infrastructure
Bring existing, manually-created infrastructure under Terraform management using terraform import.
Example: Import an existing AWS EC2 instance:
1. Define the resource in Terraform first (main.tf):
resource "aws_instance" "existing_web_server" {
ami = "ami-12345678"
instance_type = "t2.micro"
}
2. Import resource using instance ID:
terraform import aws_instance.existing_web_server i-0abcd1234ef567890
Terraform updates the state with the resource's info. Run terraform plan to align your configuration with the actual resource properties.
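Terraform 1.5 and later also support a declarative import block, which lets terraform plan/apply perform the import instead of a separate CLI step. A sketch (the instance ID is a placeholder):

```hcl
import {
  to = aws_instance.existing_web_server
  id = "i-0abcd1234ef567890" # placeholder instance ID
}
```

On these versions, terraform plan -generate-config-out=generated.tf can also draft matching resource configuration for the imported object.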
Moving and Removing State Resources
Manage state with Terraform state commands:
- Move a resource:
terraform state mv aws_instance.old_name aws_instance.new_name
- Remove resource from state (without destroying):
terraform state rm aws_instance.resource_name
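Since Terraform 1.1, a rename can also be recorded declaratively with a moved block, so everyone applying the configuration gets the same state change without running state mv by hand:

```hcl
moved {
  from = aws_instance.old_name
  to   = aws_instance.new_name
}
```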
State Refreshing & Synchronization
Sync Terraform state with real-world resources:
terraform refresh
Updates local state file with actual infrastructure state.
Useful if manual infrastructure changes occurred.
Note: terraform refresh is deprecated in newer Terraform versions. Use terraform apply -refresh-only instead:
terraform apply -refresh-only
Handling State Corruption & Recovery
If state files get corrupted:
Use backups: Always store backup copies (automatic with remote backends).
State Recovery using backups:
terraform state pull > backup.tfstate
terraform state push backup.tfstate
Manual state inspection (carefully):
terraform state list
terraform state show RESOURCE_NAME
Terraform State Best Practices (Cheat Sheet):
Always use Remote State for teams/production environments.
Encrypt state files and restrict access.
Implement versioning and backups for remote state (e.g., S3 bucket versioning).
Use state locking to prevent concurrent modification issues.
Never manually edit state files; always use Terraform CLI commands.
Regularly run terraform apply -refresh-only for synchronization.
Troubleshooting Common State Issues:
Locked State:
terraform force-unlock <LOCK_ID>
State Conflicts:
- Refresh state: terraform apply -refresh-only
Import Failures:
- Check resource definitions carefully and retry import.
Terraform Modules
Terraform modules allow you to encapsulate, reuse, and share infrastructure components easily. In this section, you'll master creating, using, publishing, and managing Terraform modules.
What is a Terraform Module?
A module is a reusable, self-contained Terraform configuration defining a logical component or service (e.g., VPC, database cluster, Kubernetes cluster).
Advantages of Using Modules:
Reusable: Write once, reuse many times.
Maintainable: Encapsulate complexity.
Scalable: Facilitate multi-environment configurations.
Collaboration: Share across teams or community.
Creating Terraform Modules
Terraform modules have the following structure:
module_name/
├── main.tf
├── variables.tf
├── outputs.tf
├── versions.tf (optional)
└── README.md
Example Module: Simple AWS EC2 instance module
main.tf
resource "aws_instance" "web" {
ami = var.ami_id
instance_type = var.instance_type
tags = var.tags
}
variables.tf
variable "ami_id" {
type = string
description = "AMI ID for EC2 instance"
}
variable "instance_type" {
type = string
default = "t3.micro"
}
variable "tags" {
type = map(string)
default = {}
}
outputs.tf
output "instance_id" {
value = aws_instance.web.id
description = "ID of the EC2 instance"
}
output "public_ip" {
value = aws_instance.web.public_ip
description = "Public IP of the EC2 instance"
}
Using Modules
Modules can be sourced from:
Local directories
Terraform Registry
Git repositories
Using Local Modules:
module "my_ec2_instance" {
source = "../modules/ec2"
ami_id = "ami-12345678"
instance_type = "t3.medium"
tags = {
Environment = "dev"
Team = "Backend"
}
}
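Values exported by a module's outputs.tf are read with the module. prefix. Using the EC2 module above:

```hcl
output "web_server_ip" {
  # Surfaces the module's public_ip output at the root level
  value = module.my_ec2_instance.public_ip
}
```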
Using Modules from Terraform Registry:
Terraform Registry hosts community and official modules:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = "my-vpc"
cidr = "10.0.0.0/16"
enable_dns_hostnames = true
tags = {
Terraform = "true"
Environment = "prod"
}
}
Using Git-based Modules:
module "app_module" {
source = "git::https://github.com/myorg/my-terraform-module.git?ref=v1.2.0"
parameter = "value"
}
Publishing Your Module
To publish your module publicly:
Host your module on GitHub or GitLab.
Follow Terraform Registry guidelines.
Create version tags (v1.0.0, v1.1.0).
Once published, anyone can use your module directly via Terraform Registry or Git URL.
Best Practices for Terraform Modules
Clear README: Document inputs, outputs, and usage examples.
Versioning: Follow semantic versioning (v1.0.0, v2.0.0).
Consistency: Use a standard file layout (main.tf, variables.tf, outputs.tf).
Granularity: Avoid overly complex modules; prefer composable, focused modules.
Flexible and Sensible Defaults: Provide sensible default values while allowing overrides.
Testing and Validation: Regularly test modules for reliability (Terratest).
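Module inputs can also guard against bad values with validation blocks. A sketch (the allowed-type rule is illustrative):

```hcl
variable "instance_type" {
  type    = string
  default = "t3.micro"

  validation {
    # Reject anything outside the t2/t3 families
    condition     = can(regex("^t[23]\\.", var.instance_type))
    error_message = "Only t2/t3 instance types are allowed."
  }
}
```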
Testing Modules (Terratest)
Terratest provides automated testing for Terraform modules using Go:
Example Terratest test (test/main_test.go):
package test
import (
"testing"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/stretchr/testify/assert"
)
func TestTerraformModule(t *testing.T) {
terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
TerraformDir: "../",
})
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
instanceID := terraform.Output(t, terraformOptions, "instance_id")
publicIP := terraform.Output(t, terraformOptions, "public_ip")
assert.NotEmpty(t, instanceID)
assert.NotEmpty(t, publicIP)
}
Run tests:
go test -v ./test
Common Module Pitfalls
Not versioning modules: Always version modules explicitly.
Complex modules: Simplify modules into smaller, focused pieces.
Poor documentation: Clearly document inputs, outputs, examples.
Ignoring testing: Regularly test and validate modules to ensure reliability.
📖 Quick Reference (Cheat Sheet):
Operation | Command/Usage |
Create local module | module "name" { source = "../module" } |
Use Terraform Registry | source = "user/module/provider" |
Use Git Module | source = "git::https://git-url?ref=tag" |
Versioning | Tag versions (v1.0.0 ) in Git |
Test Modules (Terratest) | Write tests in Go, run via go test |
List module outputs | terraform output |
Terraform Workspaces & Environments
Managing multiple environments like development, staging, and production can be streamlined effectively using Terraform workspaces. This section covers workspace creation, switching, and managing environment-specific configurations to build scalable and maintainable infrastructure.
What are Terraform Workspaces?
Terraform workspaces allow you to maintain multiple isolated sets of state within a single configuration, enabling you to manage separate environments (e.g., dev, staging, prod) conveniently.
Key Benefits of Workspaces:
Easily switch between multiple environments.
Maintain clean separation between environment-specific resources.
Avoid state-file clashes.
Simplify infrastructure scaling across environments.
Terraform Workspace Commands
Command | Description |
terraform workspace new <name> | Creates and switches to a new workspace |
terraform workspace select <name> | Switches to an existing workspace |
terraform workspace list | Lists available workspaces |
terraform workspace delete <name> | Deletes a workspace (except "default") |
terraform workspace show | Displays current workspace |
Creating and Switching Workspaces
Create new workspaces for each environment:
Create Workspaces (dev, staging, prod):
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod
Switching workspace:
terraform workspace select staging
How Workspaces Affect Terraform State
Terraform creates separate state files per workspace, stored under:
terraform.tfstate.d/<workspace_name>/terraform.tfstate
Example structure after creating workspaces:
terraform-project/
├── main.tf
├── terraform.tfstate.d
│ ├── dev
│ │ └── terraform.tfstate
│ ├── staging
│ │ └── terraform.tfstate
│ └── prod
│ └── terraform.tfstate
Workspace states are isolated from each other.
Switching workspaces means switching state files automatically.
Using Workspaces in Configuration Files
Make your configuration workspace-aware using the built-in terraform.workspace variable.
Example: Using workspace for naming resources:
resource "aws_instance" "web" {
ami = "ami-12345678"
instance_type = "t2.micro"
tags = {
Environment = terraform.workspace
Name = "${terraform.workspace}-web-server"
}
}
Resources automatically adapt based on the current workspace (dev, staging, or prod).
Using Workspace-Specific Variables
You can define variables with different values per workspace.
Example: Workspace-specific variable selection:
variables.tf
variable "instance_type" {
type = map(string)
default = {
dev = "t2.micro"
staging = "t3.small"
prod = "t3.medium"
}
}
main.tf
resource "aws_instance" "web" {
ami = "ami-12345678"
instance_type = var.instance_type[terraform.workspace]
tags = {
Environment = terraform.workspace
Name = "${terraform.workspace}-web-server"
}
}
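One caveat with map indexing: if the current workspace has no entry in the map (for example, the default workspace), var.instance_type[terraform.workspace] fails with a key error. A defensive variant uses the built-in lookup() function with a fallback; the t2.micro default here is illustrative:

```hcl
resource "aws_instance" "web" {
  ami = "ami-12345678"

  # Fall back to t2.micro when the current workspace has no map entry
  instance_type = lookup(var.instance_type, terraform.workspace, "t2.micro")
}
```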
Workspace Usage Patterns
Recommended patterns:
Single Configuration with Multiple Workspaces: Ideal for smaller setups.
Separate Directories for Environments (without using Terraform workspaces): Ideal for very large, distinct environments.
Recommended for simplicity:
Use workspaces for small to medium-sized projects with similar infrastructure across environments.
Separate directories/projects when infrastructure varies significantly.
Best Practices for Terraform Workspaces
Clearly name workspaces (dev, staging, prod).
Avoid complicated conditional logic based solely on workspaces.
Use workspace state cautiously; consider remote backends for enhanced security.
Keep environment-specific differences minimal; use variables/locals.
Document your workspace setup clearly.
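When pairing workspaces with a remote backend, note that the S3 backend does not use the local terraform.tfstate.d/ layout; each non-default workspace's state is stored under an env:/ prefix inside the bucket. A minimal sketch (the bucket, table, and key names are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"      # placeholder bucket name
    key            = "infra/terraform.tfstate" # default workspace state path
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"         # placeholder lock table
    encrypt        = true
  }
}

# Non-default workspace states are stored at:
#   s3://my-terraform-state/env:/<workspace>/infra/terraform.tfstate
```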
Common Pitfalls with Workspaces
Workspace Misuse: Overusing workspaces when separate directories may be simpler.
Too much conditional logic: Complex conditions make configurations hard to manage.
State Confusion: Ensure clarity about which workspace you're using before applying changes.
Quick Reference Cheat Sheet:
Task | Command / Syntax |
Create workspace | terraform workspace new <env> |
Switch workspace | terraform workspace select <env> |
List workspaces | terraform workspace list |
Delete workspace | terraform workspace delete <env> |
Current workspace | terraform workspace show |
Use workspace in config | ${terraform.workspace} |
Access workspace-specific variables | var.variable_name[terraform.workspace] |
Workspace state path | terraform.tfstate.d/<workspace>/terraform.tfstate |
Practical Workspace Workflow Example:
Typical Workflow:
# Switch to dev workspace
terraform workspace select dev
terraform plan
terraform apply
# Switch to staging workspace
terraform workspace select staging
terraform plan
terraform apply
# Switch to prod workspace
terraform workspace select prod
terraform plan
terraform apply
Real-World Example (Complete Usage):
variables.tf
variable "ami" {
default = "ami-12345678"
}
variable "instance_sizes" {
default = {
dev = "t2.micro"
staging = "t3.small"
prod = "t3.large"
}
}
main.tf
provider "aws" {
region = "us-east-1"
}
resource "aws_instance" "app" {
ami = var.ami
instance_type = var.instance_sizes[terraform.workspace]
tags = {
Environment = terraform.workspace
Name = "${terraform.workspace}-app-server"
}
}
This single configuration can then be applied across all environments simply by switching workspaces.
Terraform CLI Commands Cheat Sheet
Quickly find and reference the most essential Terraform commands for daily use, troubleshooting, and smooth workflow.
Initialization & Setup Commands
Command | Description |
terraform init | Initialize the working directory (downloads providers/modules). |
terraform version | Display Terraform's installed version. |
terraform providers | List currently used providers. |
terraform providers mirror <dir> | Mirror provider plugins locally for offline usage. |
Example:
terraform init
Planning & Applying Changes
Command | Description |
terraform plan | Preview changes without applying. |
terraform plan -out=plan.tfplan | Save plan to file for later apply. |
terraform apply | Apply changes to infrastructure. |
terraform apply plan.tfplan | Apply a saved plan file. |
terraform destroy | Remove resources managed by Terraform. |
terraform refresh | Update state from real infrastructure (deprecated in newer versions). |
terraform apply -refresh-only | Reconcile Terraform state with real-world resources without applying other changes. |
Example:
terraform plan -out=infra-plan
terraform apply infra-plan
Workspace Management
Command | Description |
terraform workspace new <name> | Create & switch to a new workspace. |
terraform workspace select <name> | Switch to an existing workspace. |
terraform workspace list | List available workspaces. |
terraform workspace show | Display current workspace. |
terraform workspace delete <name> | Delete workspace (except default). |
Example:
terraform workspace new staging
terraform workspace select prod
State Management Commands
Command | Description |
terraform state list | List all resources tracked by state. |
terraform state show <resource> | Show attributes of a resource from state. |
terraform state mv <old> <new> | Move resources within state. |
terraform state rm <resource> | Remove resource from state without deleting actual infrastructure. |
terraform state pull | Retrieve remote state locally. |
terraform state push <statefile> | Upload local state to remote backend. |
Example:
terraform state mv aws_instance.old aws_instance.new
Importing & Outputs
Command | Description |
terraform import <resource> <id> | Import existing resource into Terraform. |
terraform output | Display output values from state. |
terraform output <output_name> | Display specific output. |
terraform output -json | Output values in JSON format. |
Example:
terraform import aws_instance.myserver i-1234567890abcdef0
terraform output instance_ip
Validation & Formatting
Command | Description |
terraform validate | Validate syntax of Terraform files. |
terraform fmt | Format Terraform files (.tf files). |
terraform fmt -recursive | Recursively format Terraform files in directories. |
Example:
terraform validate
terraform fmt -recursive
Debugging & Logging
Terraform provides environment variables for detailed logs:
Variable | Description |
export TF_LOG=TRACE | Enable detailed debug logging. |
export TF_LOG_PATH=terraform.log | Output logs to a specific file. |
Example:
export TF_LOG=DEBUG
terraform plan
Terraform Cloud Commands
Command | Description |
terraform login | Log in to Terraform Cloud. |
terraform logout | Log out from Terraform Cloud. |
Environment Variables (Common)
Set these to simplify configuration and authentication:
Variable | Description |
AWS_ACCESS_KEY_ID | AWS Access Key ID |
AWS_SECRET_ACCESS_KEY | AWS Secret Key |
AWS_DEFAULT_REGION | AWS default region |
GOOGLE_CREDENTIALS | GCP credentials (JSON) |
ARM_CLIENT_ID , ARM_CLIENT_SECRET | Azure credentials |
Example:
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export AWS_DEFAULT_REGION=us-east-1
Command-line Flags (Common)
Useful flags to enhance Terraform command usage:
Flag | Description |
-auto-approve | Skip interactive approval (terraform apply -auto-approve ). |
-var | Set variable directly from CLI. |
-var-file | Load variables from a .tfvars file. |
-input=false | Disable interactive prompts. |
-target=resource | Apply/plan specific resource. |
Example:
terraform apply -auto-approve -var-file=prod.tfvars
terraform destroy -target=aws_instance.myserver
Quick Workflow Example (Daily Usage)
Here's a typical daily workflow snippet:
# Initialize project
terraform init
# Check current workspace
terraform workspace show
# Plan changes
terraform plan -out=planfile
# Apply changes
terraform apply planfile
# Verify outputs
terraform output
Common CLI Errors & Troubleshooting:
"State Lock Error": run terraform force-unlock LOCK_ID (use carefully!).
"Provider missing": run terraform init (ensure a working network connection).
"Syntax validation failed": run terraform validate and fix the issues reported.
"Conflicts between state and actual resources": run terraform apply -refresh-only.
Terraform Provisioners
Provisioners in Terraform allow you to execute scripts or commands locally or remotely on resources during creation or destruction. This section covers how to use provisioners effectively, clearly explains their limitations, and offers best practices.
What are Terraform Provisioners?
Provisioners enable you to run scripts or commands directly on provisioned resources or locally on your machine to automate configuration tasks, initialization, or cleanup.
Common use cases:
Initializing virtual machines (installing software, packages, dependencies).
Uploading files to instances.
Running configuration scripts post-deployment.
Types of Provisioners
Terraform provides three main types of provisioners:
Provisioner | Description | Typical Use-case |
remote-exec | Executes commands/scripts on remote instances via SSH or WinRM. | Software installations, updates |
local-exec | Executes commands/scripts locally on the Terraform host. | Notifications, local script triggers |
file | Transfers files/directories from local host to remote instances. | Uploading configuration or data |
Using Remote-Exec Provisioner
Executes commands on remote resource after creation:
Syntax Example:
resource "aws_instance" "web" {
ami = "ami-0abcdef1234567890"
instance_type = "t2.micro"
key_name = "my_key"
provisioner "remote-exec" {
inline = [
"sudo apt update -y",
"sudo apt install -y nginx",
"echo 'Hello Terraform' | sudo tee /var/www/html/index.html",
]
connection {
type = "ssh"
user = "ubuntu"
private_key = file("~/.ssh/my_key.pem")
host = self.public_ip
}
}
}
What Happens:
Instance is created.
Connects via SSH and runs provided commands.
Using Local-Exec Provisioner
Executes commands locally after resource creation.
Syntax Example:
resource "aws_instance" "web" {
ami = "ami-0abcdef1234567890"
instance_type = "t2.micro"
provisioner "local-exec" {
command = "echo Instance created with IP: ${self.public_ip} > instance_info.txt"
}
}
What Happens:
Creates instance.
Saves the public IP address into a local file (instance_info.txt).
Using File Provisioner
Transfers files from local host to remote resources.
Syntax Example:
resource "aws_instance" "web" {
ami = "ami-0abcdef1234567890"
instance_type = "t2.micro"
key_name = "my_key"
provisioner "file" {
source = "config/nginx.conf"
destination = "/tmp/nginx.conf"
connection {
type = "ssh"
user = "ubuntu"
private_key = file("~/.ssh/my_key.pem")
host = self.public_ip
}
}
provisioner "remote-exec" {
inline = [
"sudo mv /tmp/nginx.conf /etc/nginx/nginx.conf",
"sudo systemctl restart nginx",
]
connection {
type = "ssh"
user = "ubuntu"
private_key = file("~/.ssh/my_key.pem")
host = self.public_ip
}
}
}
What Happens:
Uploads the local nginx.conf file to the remote server.
Moves the file into place and restarts the NGINX service.
Provisioner Lifecycle and Triggers
By default, provisioners run during resource creation.
To run on destruction (terraform destroy), specify:
provisioner "local-exec" {
when = destroy
command = "echo Instance destroyed! > destroy.log"
}
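A related meta-argument is on_failure, which controls whether a failing provisioner halts the run. Note also that destroy-time provisioners can only reference self, count.index, and each.key, not other resources. A sketch (the script path is a placeholder):

```hcl
provisioner "remote-exec" {
  inline = ["/tmp/cleanup.sh"] # placeholder script path

  # Log a warning and continue the run instead of failing the apply
  on_failure = continue
}
```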
Limitations and Best Practices
Provisioners have certain limitations and should be used carefully:
Best Practices:
Minimize use: Prefer built-in cloud-init, user data scripts, or configuration tools like Ansible, Puppet, Chef.
Idempotency: Scripts should handle being run multiple times safely.
Error handling: Provisioners failing cause Terraform to halt. Write robust scripts.
Sensitive data: Avoid passing secrets via provisioners directly.
Limitations:
Not suitable for complex configuration tasks.
Limited error recovery.
Provisioners aren’t tracked after initial execution; subsequent updates require resource recreation or external tools.
Alternatives to Provisioners (Recommended)
For complex or ongoing configurations, use alternatives:
Cloud-init or User Data scripts: Lightweight initialization scripts at instance launch.
Packer: Pre-built AMIs or VM images.
Configuration Management Tools: Ansible, Chef, Puppet, SaltStack.
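For comparison, the NGINX installation performed earlier with remote-exec can often be replaced by a user data script that the instance runs at first boot, with no SSH connection from Terraform required (the AMI ID is a placeholder):

```hcl
resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890" # placeholder AMI
  instance_type = "t2.micro"

  # Runs once at first boot; no SSH access needed from Terraform
  user_data = <<-EOF
    #!/bin/bash
    apt-get update -y
    apt-get install -y nginx
  EOF
}
```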
Practical Workflow Example
Simple, real-world example combining provisioners:
resource "aws_instance" "app_server" {
ami = "ami-0abcdef1234567890"
instance_type = "t3.micro"
key_name = "my_key"
provisioner "file" {
source = "setup_app.sh"
destination = "/tmp/setup_app.sh"
connection {
type = "ssh"
user = "ubuntu"
private_key = file("~/.ssh/my_key.pem")
host = self.public_ip
}
}
provisioner "remote-exec" {
inline = [
"chmod +x /tmp/setup_app.sh",
"sudo /tmp/setup_app.sh",
]
connection {
type = "ssh"
user = "ubuntu"
private_key = file("~/.ssh/my_key.pem")
host = self.public_ip
}
}
provisioner "local-exec" {
command = "echo App server deployed at ${self.public_ip} >> deploy.log"
}
}
Quick Reference Cheat Sheet:
Provisioner | Use Case | Example |
remote-exec | Remote commands (SSH/WinRM) | Installing software remotely |
local-exec | Commands locally | Logging, notifications |
file | Transfer files to instance | Uploading configs, binaries |
Run on destroy | when = destroy | Cleanup tasks upon resource removal |
Common Provisioner Errors and Troubleshooting:
SSH Connection Issues:
- Check key permissions (chmod 400 my_key.pem).
- Ensure the correct user (ubuntu, ec2-user, etc.).
- Confirm the security group allows SSH (port 22).
Provisioner Timeout:
- Increase the timeout in the connection block (timeout = "5m").
Scripts Fail:
- Ensure scripts are idempotent and tested manually before running with Terraform.
Terraform with Popular Cloud Providers
Terraform excels at managing infrastructure across multiple cloud platforms. In this section, you'll learn how to quickly set up essential resources on AWS, Azure, Google Cloud, and DigitalOcean.
AWS with Terraform
Provider Setup:
provider "aws" {
region = "us-east-1"
}
Set AWS credentials via environment variables:
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export AWS_DEFAULT_REGION=us-east-1
Create an EC2 Instance:
resource "aws_instance" "example" {
ami = "ami-0c55b159cbfafe1f0" # Amazon Linux 2
instance_type = "t3.micro"
tags = {
Name = "TerraformExample"
}
}
Azure with Terraform
Provider Setup:
provider "azurerm" {
features {}
}
Set Azure credentials via environment variables:
export ARM_CLIENT_ID=your_client_id
export ARM_CLIENT_SECRET=your_secret
export ARM_SUBSCRIPTION_ID=your_subscription_id
export ARM_TENANT_ID=your_tenant_id
Create Azure VM:
resource "azurerm_resource_group" "example" {
name = "rg-terraform"
location = "East US"
}
resource "azurerm_virtual_network" "example" {
name = "vnet-terraform"
location = azurerm_resource_group.example.location
resource_group_name = azurerm_resource_group.example.name
address_space = ["10.0.0.0/16"]
}
resource "azurerm_subnet" "example" {
name = "subnet1"
resource_group_name = azurerm_resource_group.example.name
virtual_network_name = azurerm_virtual_network.example.name
address_prefixes = ["10.0.2.0/24"]
}
resource "azurerm_network_interface" "example" {
name = "nic-terraform"
location = azurerm_resource_group.example.location
resource_group_name = azurerm_resource_group.example.name
ip_configuration {
name = "ipconfig1"
subnet_id = azurerm_subnet.example.id
private_ip_address_allocation = "Dynamic"
}
}
resource "azurerm_linux_virtual_machine" "example" {
name = "vm-terraform"
resource_group_name = azurerm_resource_group.example.name
location = azurerm_resource_group.example.location
size = "Standard_B1s"
admin_username = "azureuser"
admin_password = "ComplexPassw0rd!"
network_interface_ids = [
azurerm_network_interface.example.id,
]
os_disk {
caching = "ReadWrite"
storage_account_type = "Standard_LRS"
}
source_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "18.04-LTS"
version = "latest"
}
}
Google Cloud Platform (GCP) with Terraform
Provider Setup:
provider "google" {
credentials = file("path/to/credentials.json")
project = "my-gcp-project"
region = "us-central1"
}
Set credentials via environment variable:
export GOOGLE_CREDENTIALS=$(cat path/to/credentials.json)
Create GCP Compute Engine VM:
resource "google_compute_instance" "example" {
name = "terraform-vm"
machine_type = "f1-micro"
zone = "us-central1-a"
boot_disk {
initialize_params {
image = "debian-cloud/debian-11"
}
}
network_interface {
network = "default"
access_config {}
}
}
DigitalOcean with Terraform
Provider Setup:
provider "digitalocean" {
token = var.do_token
}
Set the DigitalOcean API token via an environment variable:
export DIGITALOCEAN_TOKEN=your_token_here
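The provider block above references var.do_token, so that variable must be declared; and because the DigitalOcean provider lives outside the hashicorp registry namespace, it also needs an explicit required_providers entry. A minimal sketch:

```hcl
terraform {
  required_providers {
    digitalocean = {
      source = "digitalocean/digitalocean"
    }
  }
}

variable "do_token" {
  type      = string
  sensitive = true # keep the token out of CLI output
}
```

If DIGITALOCEAN_TOKEN is set in the environment, the token argument in the provider block can be omitted entirely.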
Create a DigitalOcean Droplet:
resource "digitalocean_droplet" "example" {
name = "terraform-droplet"
region = "nyc3"
size = "s-1vcpu-1gb"
image = "ubuntu-22-04-x64"
ssh_keys = [
"your-ssh-key-fingerprint"
]
}
Quick Reference Cheat Sheet (Cloud Providers):
Provider | Common Resources | Terraform Registry |
AWS | EC2, S3, RDS, IAM, Lambda, VPC | Terraform AWS Provider |
Azure | VM, Storage, SQL Database, App Service | Terraform Azure Provider |
GCP | Compute Engine, Storage, Cloud SQL | Terraform GCP Provider |
DigitalOcean | Droplets, Spaces, Databases | Terraform DigitalOcean Provider |
Best Practices for Multi-Cloud Terraform
Separate projects per cloud provider clearly.
Use modules to maintain reusable components.
Leverage provider-specific data sources to dynamically fetch data.
Store sensitive credentials securely, ideally via environment variables or secrets management tools.
Troubleshooting Common Issues
Provider authentication errors:
Ensure credentials are correctly set in environment variables.
Validate API access permissions.
Instance creation failures:
Confirm region availability for resource types.
Check quotas or resource limits in the cloud provider’s dashboard.
Real-world Project Structure Example
A robust structure for multi-cloud Terraform projects:
terraform-multicloud/
├── aws
│ ├── main.tf
│ └── variables.tf
├── azure
│ ├── main.tf
│ └── variables.tf
├── gcp
│ ├── main.tf
│ └── variables.tf
├── digitalocean
│ ├── main.tf
│ └── variables.tf
└── modules
├── aws_ec2
├── azure_vm
├── gcp_compute
└── digitalocean_droplet
Advanced Terraform Concepts
In this section, you'll explore powerful Terraform features and strategies to scale and secure your infrastructure in real-world environments, including:
Advanced modules
Terraform Cloud & Enterprise
Policy as Code with Sentinel
CDK for Terraform (CDKTF)
Dynamic blocks and for-each loops
Custom providers
Advanced Module Design
Modules aren't just reusable — they can be designed for extensibility, scalability, and team collaboration.
Tips for Advanced Modules:
Expose minimal required variables, group optional ones into nested objects.
Use count or for_each for conditional resources.
Accept nested blocks as input using dynamic blocks (more below).
Include version constraints to avoid breaking changes:
terraform {
required_version = ">= 1.3.0"
}
Dynamic Blocks & for_each
Dynamic blocks allow you to generate repeating configuration blocks based on variables or complex structures.
Example: Create dynamic security group rules
variable "ingress_rules" {
default = [
{ from_port = 80, to_port = 80, protocol = "tcp", cidr = "0.0.0.0/0" },
{ from_port = 443, to_port = 443, protocol = "tcp", cidr = "0.0.0.0/0" }
]
}
resource "aws_security_group" "web_sg" {
name = "web-sg"
dynamic "ingress" {
for_each = var.ingress_rules
content {
from_port = ingress.value.from_port
to_port = ingress.value.to_port
protocol = ingress.value.protocol
cidr_blocks = [ingress.value.cidr]
}
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
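An alternative to a dynamic block is for_each at the resource level, creating one aws_security_group_rule per entry. This keeps each rule as a separate resource in state (the map key just needs to be unique per rule; keying by port is one convention):

```hcl
resource "aws_security_group_rule" "ingress" {
  # One rule resource per ingress entry, keyed by port
  for_each = { for r in var.ingress_rules : tostring(r.from_port) => r }

  type              = "ingress"
  from_port         = each.value.from_port
  to_port           = each.value.to_port
  protocol          = each.value.protocol
  cidr_blocks       = [each.value.cidr]
  security_group_id = aws_security_group.web_sg.id
}
```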
Terraform Cloud & Enterprise
Terraform Cloud is a managed service that provides collaboration, state storage, remote runs, and policy enforcement.
Key Features:
Remote Execution: Terraform plans and applies happen in the cloud.
Remote State Storage: Secure, versioned backend.
Variable Management: Environment, sensitive, or team-specific.
Sentinel Policies: Governance and compliance enforcement.
Team Permissions: Role-based access control.
Workflow Example:
terraform login # Authenticate with Terraform Cloud
terraform init # Configure backend in terraform block
terraform plan # Plan runs remotely
terraform apply # Approve plan in Terraform UI or CLI
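The backend used by terraform init is configured with a cloud block inside the terraform block (supported in Terraform 1.1+; the organization and workspace names below are placeholders):

```hcl
terraform {
  cloud {
    organization = "my-org" # placeholder organization

    workspaces {
      name = "my-app-prod" # placeholder workspace
    }
  }
}
```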
Policy as Code with Sentinel
Sentinel is HashiCorp’s policy-as-code framework for enforcing rules on infrastructure plans.
Example Use Case:
Disallow creation of public S3 buckets
Enforce tagging standards
Restrict resource types or regions
Example Sentinel Policy:
import "tfplan/v2"
public_buckets = filter tfplan.resource_changes as rc {
rc.type is "aws_s3_bucket" and
rc.change.after.acl is "public-read"
}
main = rule {
length(public_buckets) is 0
}
CDK for Terraform (CDKTF)
The Cloud Development Kit for Terraform (CDKTF) allows you to use familiar programming languages (TypeScript, Python, Go, Java, C#) instead of HCL.
Benefits:
Full power of imperative logic
Reuse NPM/PyPI packages
Strong typing & IntelliSense
CDKTF Workflow:
npm install -g cdktf-cli
cdktf init --template=typescript --local
cdktf synth
cdktf deploy
CDKTF translates your code into standard Terraform JSON behind the scenes.
Custom Terraform Providers (Advanced)
When no provider exists for a system you want to manage, you can build your own.
Use Cases:
Managing internal APIs or tools
Integrating with non-cloud systems
Tools:
Written in Go
Uses Terraform Plugin SDK
Can be distributed via HashiCorp Registry or GitHub
Quick Advanced Terraform Concepts Cheat Sheet:
Feature | Purpose |
for_each / count | Create resources conditionally or in loop |
dynamic blocks | Repeat nested blocks programmatically |
Sentinel | Policy-as-code for enterprise governance |
CDKTF | Write Terraform using traditional languages |
Custom providers | Extend Terraform for unsupported APIs |
Terraform Cloud | Remote state, execution, team workflows |
Terraform CI/CD Integration
Integrating Terraform with CI/CD pipelines enables automated, consistent, and safe infrastructure deployment. In this section, you'll learn how to wire Terraform into platforms like GitHub Actions, GitLab CI/CD, and Jenkins.
Why Use CI/CD with Terraform?
Automating Terraform through CI/CD:
Reduces manual errors
Standardizes workflows
Enables approval gates and auditing
Supports GitOps (infra-as-code driven by version control)
Example 1: Terraform with GitHub Actions
Folder Structure:
.
├── .github/
│ └── workflows/
│ └── terraform.yml
├── main.tf
├── variables.tf
└── backend.tf
GitHub Workflow File (.github/workflows/terraform.yml):
name: Terraform CI
on:
push:
branches: [ "main" ]
pull_request:
jobs:
terraform:
name: Terraform Format, Validate, Plan, and Apply
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.6.0
- name: Terraform Format
run: terraform fmt -check
- name: Terraform Init
run: terraform init
- name: Terraform Validate
run: terraform validate
- name: Terraform Plan
run: terraform plan -out=plan.tfplan
- name: Terraform Apply (auto-approve on push to main)
if: github.ref == 'refs/heads/main'
run: terraform apply -auto-approve plan.tfplan
Secrets Required:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
Store them in GitHub → Settings → Secrets → Actions
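The workflow above declares which secrets are needed but never passes them to Terraform. One common approach (assuming the secret names listed) is a job-level env block so every step inherits the credentials:

```yaml
jobs:
  terraform:
    runs-on: ubuntu-latest
    env:
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      AWS_DEFAULT_REGION: us-east-1
```

For AWS specifically, OIDC-based role assumption avoids long-lived keys entirely, as noted in the security section below.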
Example 2: Terraform with GitLab CI/CD
.gitlab-ci.yml
Example:
stages:
- validate
- plan
- apply
variables:
TF_ROOT: "."
TF_VERSION: "1.6.0"
before_script:
- terraform --version
- cd $TF_ROOT
validate:
stage: validate
image: hashicorp/terraform:$TF_VERSION
script:
- terraform init -backend=false
- terraform validate
plan:
stage: plan
image: hashicorp/terraform:$TF_VERSION
script:
- terraform init
- terraform plan -out=tfplan
artifacts:
paths:
- tfplan
apply:
stage: apply
image: hashicorp/terraform:$TF_VERSION
script:
- terraform apply -auto-approve tfplan
when: manual
only:
- main
GitLab CI/CD Features:
Manual approval before apply
Built-in variable management
Integrated logging and pipeline history
Example 3: Terraform with Jenkins
Jenkins Pipeline Script:
pipeline {
agent any
environment {
AWS_ACCESS_KEY_ID = credentials('aws-access-key')
AWS_SECRET_ACCESS_KEY = credentials('aws-secret-key')
}
stages {
stage('Checkout') {
steps {
git 'https://github.com/your/repo.git'
}
}
stage('Init') {
steps {
sh 'terraform init'
}
}
stage('Validate') {
steps {
sh 'terraform validate'
}
}
stage('Plan') {
steps {
sh 'terraform plan -out=tfplan'
}
}
stage('Apply') {
when {
branch 'main'
}
steps {
sh 'terraform apply -auto-approve tfplan'
}
}
}
}
Notes:
Jenkins credentials store integrates with AWS CLI or Terraform directly.
Consider using Jenkins Terraform plugin for managing versions.
Security & Secrets Management
Use environment variables or secrets vaults (GitHub Secrets, GitLab CI Variables, Jenkins Credentials).
Avoid hardcoding credentials in .tf or .yml files.
Prefer service principals or IAM roles when available (e.g., using OIDC for GitHub → AWS).
Best Practices for Terraform in CI/CD
Practice | Why It Matters |
Use separate stages for plan and apply | Enables approvals and visibility |
Store plans as artifacts | Allows reuse and traceability |
Protect main branches | Prevent unapproved changes |
Use Terraform Cloud or remote state | Centralized state & collaboration |
Lint & format on PR | Enforce code consistency |
Run terraform validate early | Catch issues before applying |
Terraform Security Best Practices
Security in Terraform isn't just about encrypted state files — it's about controlling access, protecting secrets, minimizing blast radius, and ensuring reproducibility.
This section covers:
Securing state
Managing secrets safely
Least-privilege IAM
Locking down Terraform execution
Auditability and compliance
Tools for security scanning
1. Secure Your State Files
Terraform state files contain sensitive information such as:
Passwords and secrets
Cloud resource metadata
IP addresses and key names
Recommendations:
Never commit terraform.tfstate or its backups to Git.
Use remote backends like:
AWS S3 with server-side encryption (SSE-S3/KMS)
Terraform Cloud
Azure Blob Storage with encryption and locking
Enable versioning in S3/GCS/Azure to roll back corrupt or leaked states.
Use state encryption at rest and in transit.
2. Manage Secrets Securely
Avoid storing credentials in .tf files, terraform.tfvars, or plaintext anywhere in version control.
Use:
Environment variables
Secrets Managers (e.g., AWS Secrets Manager, Vault, Doppler)
Remote variable injection via CI/CD pipelines
Use sensitive = true on sensitive output variables:
output "db_password" {
value = var.db_password
sensitive = true
}
Terraform will now hide the value in CLI output and plan logs. Note that the value is still stored in plaintext in the state file, so the state itself must be secured.
3. Use Least-Privilege IAM
Apply the principle of least privilege when creating Terraform’s cloud credentials:
For AWS:
Create separate IAM user or role with minimal permissions.
Deny access to secrets, non-relevant services.
Prefer temporary credentials or role assumption via STS.
For Azure:
Use Service Principals with specific role assignments.
Assign Contributor, not Owner, unless broader access is explicitly needed.
For GCP:
- Use Workload Identity Federation or Service Accounts with minimum scopes.
4. Lock Down Terraform Execution
Ensure Terraform apply only runs in trusted environments (e.g., CI/CD or Terraform Cloud).
Restrict apply permissions using branch protection or manual approvals.
Use Sentinel or OPA (Open Policy Agent) to restrict what gets deployed.
Example Policies:
No public S3 buckets
Tag enforcement (owner, environment)
Only use approved regions
5. Enable Audit Logs
Maintain traceability of infrastructure changes.
Audit options:
Terraform Cloud: Built-in run logs & policy enforcement
Git history: track changes to .tf files
Remote backend logs (e.g., CloudTrail for S3 access)
Enable versioning + access logging in backends
6. Use Security Scanners
Automated tools help detect misconfigurations and violations of best practices.
Recommended Tools:
Tool | Purpose | Usage Example |
tfsec | Static analysis of Terraform code | tfsec . |
checkov | Infrastructure-as-code scanning | checkov -d . |
terrascan | Policy-as-code scanning | terrascan scan |
TFLint | Linting and best-practice checking | tflint |
Integrate these tools into your CI/CD workflows for automatic checks on every PR.
7. Secure Module Use
If you’re using or publishing modules:
Use version pinning to avoid unexpected changes:
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"
}
Audit modules for:
Public exposure (S3, Load Balancers)
Open security groups (0.0.0.0/0)
Hardcoded credentials or default passwords
8. Isolate Environments & State
Create separate state files for each environment (dev, staging, prod).
Avoid sharing variables or backends across environments.
Use Terraform workspaces cautiously; prefer isolated directories for critical infra.
9. Use terraform plan for Review
Always review the plan output before applying.
In CI/CD: store plan.tfplan as an artifact.
Require manual approval before apply for sensitive environments.
Quick Security Cheat Sheet
Concern | Best Practice |
Secrets | Use env vars / secrets manager / CI/CD vaults |
State | Use remote encrypted backend, restrict access |
IAM | Least privilege, separate Terraform user/role |
Approvals | Require manual approval before apply |
Audit | Git history + remote logs + state versioning |
Validation | tfsec / checkov / terrascan / OPA |
Sensitive Outputs | sensitive = true |
Apply Restrictions | Only in CI/CD or controlled environments |
Module Safety | Pin versions, audit third-party modules |
Common Terraform Errors & Troubleshooting
Even with perfect code, Terraform can throw unexpected errors due to API changes, networking issues, resource drift, or misconfigured state. This part will help you identify, understand, and resolve the most common issues you'll encounter.
1. Initialization Issues (terraform init)
Error: Provider not found / Failed to install provider
Error: Failed to install provider
│ Could not retrieve the list of available versions for provider...
Fix:
Run terraform init -upgrade
Ensure a working internet connection
Validate your provider block:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
2. Validation & Syntax Errors (terraform validate)
Error: Invalid function call or undefined variable
Error: Unsupported attribute
Fix:
Confirm the variable exists and is referenced correctly: var.variable_name
Use terraform console to test expressions interactively
Run terraform validate and read line/column numbers carefully
3. Planning Issues (terraform plan)
Error: Resource depends on uncreated resource
Error: Reference to undeclared resource
Fix:
Ensure the resource you’re referencing actually exists in your config
Use depends_on to explicitly enforce ordering if needed:
depends_on = [aws_security_group.allow_http]
4. Apply Errors (terraform apply)
Error: Timeout, API limit, dependency failure
Error: Error waiting for instance (i-abc123) to become ready...
Fix:
Retry terraform apply again after some time
Add a timeouts block if you need longer provisioning windows
Avoid applying during cloud provider maintenance windows
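As a sketch, a timeouts block inside a resource looks like this (supported operations vary per resource type; the durations here are illustrative):

```hcl
resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890" # placeholder AMI
  instance_type = "t2.micro"

  timeouts {
    create = "20m" # wait longer for slow provisioning
    delete = "30m"
  }
}
```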
Error: User data script or provisioner fails
Error: remote-exec provisioner error
Fix:
SSH into the instance manually and debug (ping, cloud-init logs)
Make sure:
SSH port is open in the security group
Correct username (ubuntu, ec2-user, etc.)
Script is executable and idempotent
5. State Lock Errors
Error: State is locked
Error: Error acquiring the state lock
Fix:
Check if another operation is running
Run force unlock (only when safe):
terraform force-unlock <LOCK_ID>
6. Resource Already Exists
Error: Resource already managed or exists outside Terraform
Error: Resource already exists
Fix:
If it's unmanaged by Terraform, import it:
terraform import aws_instance.web i-1234567890abcdef0
If managed but renamed, use:
terraform state mv old.name new.name
7. Destroy Fails
Error: Resource cannot be destroyed
Error: DependencyViolation
Fix:
- Ensure dependent resources are removed first
- Check for external dependencies (e.g., manually attached EBS volumes)
- Try destroying a single resource: terraform destroy -target=resource_type.name
8. Drift Between State and Reality
Terraform plans updates or destroys resources you didn’t change.
Fix:
- Run terraform apply -refresh-only (recommended in newer Terraform versions)
- Investigate changes made manually outside Terraform
- Re-import the resource if needed
9. Troubleshooting Tips & Debug Mode
Use terraform console
terraform console
> var.instance_type
"t3.micro"
Helps validate expressions, variable values, and outputs interactively.
Enable Debug Logs
export TF_LOG=DEBUG
export TF_LOG_PATH=terraform.log
terraform apply
Levels: TRACE, DEBUG, INFO, WARN, ERROR
Use the log to identify request/response pairs, provider errors, and JSON payloads.
10. Clean Slate & Reset
If things are too messy, reset everything safely:
rm -rf .terraform/ terraform.tfstate* .terraform.lock.hcl
terraform init
Use with caution: make sure you’re not deleting valid, active state.
Troubleshooting Cheat Sheet
| Problem | Fix |
| --- | --- |
| Provider not found | `terraform init -upgrade`, check `required_providers` |
| Resource exists externally | Use `terraform import` |
| Plan shows unexpected destroy | Run `terraform plan -refresh-only` and review manual changes |
| SSH/provisioner fails | Check username, key, firewall, or remote access |
| State is locked | Wait or `terraform force-unlock <id>` |
| Secrets in state | Use `sensitive = true`, remote encrypted state |
| Circular dependency | Use `depends_on` |
Terraform Quick Reference
A concise summary of Terraform’s syntax, CLI, blocks, patterns, and productivity tips.
CLI Command Quick Reference
| Command | Purpose |
| --- | --- |
| `terraform init` | Initialize working directory |
| `terraform plan` | Preview changes |
| `terraform apply` | Apply changes |
| `terraform destroy` | Destroy all managed resources |
| `terraform validate` | Validate .tf file syntax |
| `terraform fmt` | Format Terraform code |
| `terraform output` | Show output values |
| `terraform console` | Interactive expression testing |
| `terraform show` | Show full state in readable format |
| `terraform graph` | Generate DOT graph of resources |
| `terraform import <res> <id>` | Import existing resource |
| `terraform taint <res>` | Mark a resource for recreation (deprecated; prefer `terraform apply -replace=<res>`) |
| `terraform workspace` | Manage environments (dev/staging/prod) |
| `terraform state` | View/edit state file |
File Structure
| File | Purpose |
| --- | --- |
| `main.tf` | Core configuration |
| `variables.tf` | Input variables |
| `outputs.tf` | Output values |
| `terraform.tfvars` | Variable values (often excluded from VCS) |
| `backend.tf` | Remote backend config |
| `.terraform.lock.hcl` | Provider version lock file |
Terraform Block Patterns
Resource Block
resource "aws_instance" "web" {
ami = "ami-123"
instance_type = "t2.micro"
tags = {
Name = "WebServer"
}
}
Variable Block
variable "region" {
type = string
default = "us-east-1"
}
Output Block
output "ip" {
value = aws_instance.web.public_ip
}
Locals
locals {
app_name = "my-app"
}
Loops & Conditionals
for_each
resource "aws_s3_bucket" "buckets" {
for_each = toset(["dev", "staging", "prod"])
bucket = "my-bucket-${each.key}"
}
count
resource "aws_instance" "web" {
count = 2
instance_type = "t2.micro"
}
Conditional Expression
instance_type = var.env == "prod" ? "t3.large" : "t3.micro"
Dynamic Block
dynamic "ingress" {
for_each = var.rules
content {
from_port = ingress.value.from
to_port = ingress.value.to
protocol = ingress.value.protocol
cidr_blocks = [ingress.value.cidr]
}
}
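The dynamic block above assumes var.rules is a list of objects with from/to/protocol/cidr fields, roughly like this (the default rules are illustrative):

```hcl
variable "rules" {
  type = list(object({
    from     = number
    to       = number
    protocol = string
    cidr     = string
  }))
  default = [
    { from = 80,  to = 80,  protocol = "tcp", cidr = "0.0.0.0/0" },
    { from = 443, to = 443, protocol = "tcp", cidr = "0.0.0.0/0" },
  ]
}
```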
Security Best Practices
| Task | Best Practice |
| --- | --- |
| Secrets | Use env vars or secret managers |
| Sensitive outputs | `sensitive = true` |
| Remote state | Use S3 + DynamoDB or Terraform Cloud |
| Least privilege | Apply minimal IAM policies |
| Validation | `tfsec`, `checkov`, `tflint` |
Useful Built-in Functions
| Type | Function(s) |
| --- | --- |
| String | `upper()`, `lower()`, `replace()` |
| Collection | `length()`, `merge()`, `flatten()` |
| Numeric | `max()`, `min()`, `ceil()` |
| Encoding | `jsonencode()`, `base64encode()` |
| Misc | `lookup()`, `file()`, `element()` |
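A few of these in action, sketched as locals (easy to check interactively in terraform console):

```hcl
locals {
  env_name  = upper("dev")                                  # "DEV"
  all_tags  = merge({ Team = "platform" }, { Env = "dev" }) # combined map
  flat_list = flatten([["a"], ["b", "c"]])                  # ["a", "b", "c"]
  payload   = jsonencode({ port = 8080 })                   # JSON string
  first_az  = element(["us-east-1a", "us-east-1b"], 0)      # "us-east-1a"
}
```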
Provider Block Examples
provider "aws" {
region = var.region
}
provider "google" {
credentials = file("gcp.json")
project = var.project_id
region = var.region
}
Best Practices Summary
Use modules and version pinning
Isolate environments with workspaces or directories
Never commit state files or secrets
Run fmt, validate, and plan before apply
Use -auto-approve only in CI/CD or sandbox environments
Final Workflow Snapshot
terraform init
terraform fmt -recursive
terraform validate
terraform plan -out=plan.tfplan
terraform apply plan.tfplan
terraform output
Real-World Terraform Architecture Examples
This part walks through practical, production-ready Terraform architecture blueprints across several use cases, complete with file layouts, key resources, and patterns used by real teams.
Example 1: Basic 3-Tier Web App on AWS
Terraform Components:
| Component | Resource Type |
| --- | --- |
| VPC | `aws_vpc`, `aws_subnet` |
| Load Balancer | `aws_lb`, `aws_lb_target_group`, `aws_lb_listener` |
| EC2 Instances | `aws_instance`, `aws_launch_template` |
| Security Groups | `aws_security_group` |
| Database | `aws_db_instance` |
Folder Layout:
project/
├── main.tf
├── variables.tf
├── outputs.tf
├── modules/
│ ├── vpc/
│ ├── ec2/
│ └── rds/
├── environments/
│ ├── dev/
│ └── prod/
Example 2: Scalable EKS Kubernetes Cluster
Architecture:
VPC with 3 subnets
Managed EKS cluster
Worker node groups
IAM roles and policies
Secrets managed via AWS Secrets Manager
Key Resources:
aws_eks_cluster, aws_eks_node_group
aws_iam_role, aws_iam_policy
kubernetes_* (via the kubernetes provider)
aws_secretsmanager_secret
Example 3: Azure Serverless App with Terraform
Architecture:
Azure Resource Group
App Service + App Insights
Azure Storage
Azure Cosmos DB
Resource Types:
azurerm_app_service
azurerm_application_insights
azurerm_cosmosdb_account
azurerm_storage_account
Example 4: GCP Auto-Scaled Compute Instance Group
Setup:
VPC network + subnets
Instance template + MIG (Managed Instance Group)
Load balancer
Health checks
Resource Types:
google_compute_instance_template
google_compute_region_instance_group_manager
google_compute_forwarding_rule
google_compute_health_check
Real-World Practices to Adopt
| Practice | Description |
| --- | --- |
| Modularization | Reuse and encapsulate resources |
| Multi-env isolation | Separate workspaces or folders for dev/prod |
| Secrets separation | Inject from environment or secret managers |
| GitOps Flow | Trigger pipelines via PRs and code commits |
| Consistent tagging | Tag resources with `Environment`, `Owner`, `Project` |
Terraform + Ansible, Packer, and Docker
Automating Provisioning, Image Building, and Container Orchestration
Terraform is amazing for infrastructure provisioning — but real-world DevOps combines tools. This part shows how to integrate Terraform with Ansible, Packer, and Docker to build a production-ready, fully automated pipeline.
1. Terraform + Ansible (Provisioning + Configuration)
Use Case:
Terraform provisions infrastructure (e.g., EC2, GCP VM, Azure VM).
Ansible configures software on those instances (e.g., installs Docker, deploys apps).
Example Flow:
1. Terraform provisions EC2 with SSH access
2. Terraform outputs public IP
3. Ansible connects via SSH and runs playbooks
Terraform Snippet (to output IP):
output "public_ip" {
value = aws_instance.web.public_ip
}
Ansible Inventory:
[web]
${public_ip} ansible_user=ubuntu ansible_ssh_private_key_file=~/.ssh/mykey.pem
Ansible Command:
ansible-playbook -i inventory.ini playbook.yml
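Instead of hand-editing the inventory, one option is to have Terraform render it from the output. A sketch using the hashicorp/local provider and a hypothetical inventory.tftpl template file:

```hcl
# inventory.tftpl (assumed template file next to this config):
#   [web]
#   ${ip} ansible_user=ubuntu ansible_ssh_private_key_file=~/.ssh/mykey.pem

# Render the Ansible inventory from the instance's public IP.
resource "local_file" "ansible_inventory" {
  filename = "${path.module}/inventory.ini"
  content = templatefile("${path.module}/inventory.tftpl", {
    ip = aws_instance.web.public_ip
  })
}
```

After terraform apply, the generated inventory.ini can be passed straight to ansible-playbook.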
2. Terraform + Packer (Image Building)
Use Case:
Packer builds custom VM images (AMI, GCP Image, Azure image).
Terraform uses the image to launch instances.
Example Workflow:
1. Packer builds AMI with NGINX pre-installed
2. Terraform uses that AMI in `aws_instance`
Packer Template (NGINX AMI):
{
"builders": [
{
"type": "amazon-ebs",
"region": "us-east-1",
"source_ami": "ami-0c55b159cbfafe1f0",
"instance_type": "t2.micro",
"ssh_username": "ubuntu",
"ami_name": "nginx-{{timestamp}}"
}
],
"provisioners": [
{
"type": "shell",
"inline": [
"sudo apt update",
"sudo apt install -y nginx"
]
}
]
}
Terraform Snippet:
variable "ami_id" {}
resource "aws_instance" "web" {
ami = var.ami_id
instance_type = "t2.micro"
}
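Instead of passing ami_id by hand, the Terraform side can look up the newest image Packer produced, since the template names images nginx-{{timestamp}}. A sketch using an aws_ami data source (this variant would replace the variable-based snippet above):

```hcl
# Find the most recent self-owned AMI matching the Packer naming scheme.
data "aws_ami" "nginx" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["nginx-*"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.nginx.id
  instance_type = "t2.micro"
}
```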
3. Terraform + Docker (Containers on Demand)
Terraform can directly provision and manage Docker containers using the docker provider.
Use Case:
Run local Docker containers using Terraform
Manage container lifecycle in IaC style
Provider Configuration:
provider "docker" {}
Create a Container:
resource "docker_image" "nginx" {
name = "nginx:latest"
}
resource "docker_container" "web" {
name = "nginx_container"
image = docker_image.nginx.image_id
ports {
internal = 80
external = 8080
}
}
Run it:
terraform init
terraform apply
Combining All Three: End-to-End DevOps Flow
Packer → Creates image
↓
Terraform → Provisions instance with that image
↓
Ansible → Installs apps, configures services
Real-World Example:
| Tool | Role |
| --- | --- |
| Packer | Build hardened AMI w/ updates |
| Terraform | Spin up VPC, ALB, EC2 |
| Ansible | Set up app servers, nginx, SSL |
| Docker (optional) | Run containers locally or in cloud |
Best Practices
| Area | Best Practice |
| --- | --- |
| Ansible + Terraform | Use `local-exec` or run Ansible separately post-TF |
| Packer | Version and test AMIs per environment |
| Docker + TF | Great for dev/test setups; use ECS/K8s for prod |
| Orchestration | Use CI/CD to trigger the full flow |
Cheat Sheet
| Combo | Benefit |
| --- | --- |
| Terraform + Ansible | Infra + software configuration |
| Terraform + Packer | Immutable, fast-booting images |
| Terraform + Docker | Local container-based infra |
| All 3 combined | Full provisioning pipeline |
Multi-Cloud Deployments with Terraform
How to Provision AWS, Azure, and GCP in a Single Terraform Project
In modern organizations, infrastructure may span multiple cloud providers — for cost optimization, compliance, redundancy, or business strategy. Terraform’s provider-agnostic architecture makes it ideal for managing multi-cloud environments in a unified way.
What Is Multi-Cloud in Terraform?
Multi-cloud Terraform means using multiple provider blocks to manage resources across AWS, Azure, GCP, etc., from the same Terraform configuration or project.
Use cases:
Deploy same architecture in different clouds
Federate services across clouds (e.g., AWS DB + GCP frontend)
Maintain separate environments per cloud
1. Folder Structure for Multi-Cloud Projects
terraform-multicloud/
├── providers.tf
├── main.tf
├── variables.tf
├── outputs.tf
├── modules/
│ ├── aws_webapp/
│ ├── azure_storage/
│ └── gcp_compute/
└── environments/
├── dev/
├── prod/
Each module handles one cloud. You orchestrate across them in main.tf.
2. Defining Multiple Providers
In providers.tf:
provider "aws" {
region = var.aws_region
alias = "aws"
}
provider "azurerm" {
features {}
alias = "azure"
}
provider "google" {
credentials = file(var.gcp_credentials_file)
project = var.gcp_project
region = var.gcp_region
alias = "gcp"
}
Use aliases to distinguish providers when using multiple in one file.
3. Multi-Cloud Module Usage
AWS Module:
module "aws_web" {
source = "./modules/aws_webapp"
providers = {
aws = aws.aws
}
instance_type = "t3.micro"
}
Azure Module:
module "azure_blob" {
source = "./modules/azure_storage"
providers = {
azurerm = azurerm.azure
}
resource_group = "tf-rg"
}
GCP Module:
module "gcp_vm" {
source = "./modules/gcp_compute"
providers = {
google = google.gcp
}
zone = "us-central1-a"
}
Important: Credentials & Secrets
Use per-cloud environment variables for secure access:
AWS:
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
Azure:
export ARM_CLIENT_ID=...
export ARM_CLIENT_SECRET=...
export ARM_TENANT_ID=...
export ARM_SUBSCRIPTION_ID=...
GCP:
export GOOGLE_CREDENTIALS="$(cat gcp-key.json)"
Or reference them in a .tfvars file and load it via:
terraform apply -var-file=secrets.tfvars
Deployment Order Strategy
If dependencies exist across clouds, handle ordering via depends_on:
resource "google_compute_instance" "frontend" {
...
depends_on = [aws_db_instance.backend]
}
Use data outputs to pass information from one cloud to another (e.g., DB IP from AWS to a GCP VM).
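As a sketch of that cross-cloud wiring (assuming aws_db_instance.backend exists in the same configuration and var.gcp_zone is defined), the RDS endpoint can be injected into the GCP VM's metadata:

```hcl
resource "google_compute_instance" "frontend" {
  name         = "frontend"
  machine_type = "e2-small"
  zone         = var.gcp_zone # assumed variable

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
    }
  }

  network_interface {
    network = "default"
  }

  # the app reads the DB host from instance metadata at boot
  metadata = {
    db_host = aws_db_instance.backend.address
  }
}
```

Because of the attribute reference, Terraform also orders the creation correctly without an explicit depends_on.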
Multi-Cloud Use Cases
| Scenario | Cloud A | Cloud B |
| --- | --- | --- |
| Redundant web + DB infra | AWS (primary) | Azure (backup) |
| GCP compute, AWS DB combo | AWS RDS | GCP VM frontend |
| Dev = AWS, Prod = Azure | Isolated | Isolated |
| Storage in Azure, workload in AWS | Azure Blob | AWS Lambda |
Best Practices
| Practice | Description |
| --- | --- |
| Use aliases | Avoid collisions in provider blocks |
| Environment separation | Isolate dev/staging/prod configs |
| Modularize cloud logic | Each cloud = its own module |
| Secure credentials | Use secrets managers or CI/CD secrets |
| Version lock modules | Pin source versions or Git tags |
| Limit blast radius | Apply per-module with separate state |
Cheat Sheet: Multi-Cloud with Terraform
| Feature | Example |
| --- | --- |
| Provider alias | `alias = "azure"` |
| Use multiple providers | `providers = { aws = aws }` |
| Output sharing | `output "db_ip" { value = ... }` |
| State isolation | Use different backends per cloud |
| Secrets management | Use env vars / secrets vaults |
Bonus: Multi-Cloud Deployment Automation
Orchestrate full multi-cloud plan:
terraform init
terraform plan -out multi.tfplan
terraform apply multi.tfplan
Or split by cloud/module:
cd modules/aws_webapp && terraform apply
cd modules/azure_storage && terraform apply
You may also chain them in CI/CD jobs.
Terraform + Kubernetes + Helm
Managing Kubernetes Clusters and Deployments with Terraform
Terraform can do much more than provision virtual machines and cloud resources — it can also provision and manage Kubernetes clusters, workloads, and Helm charts. This part shows how to combine Terraform + Kubernetes provider + Helm provider to fully automate your K8s stack.
What You’ll Learn
Provision EKS / AKS / GKE clusters with Terraform
Use the Kubernetes provider to deploy resources (pods, services, namespaces)
Use the Helm provider to install charts (e.g., NGINX Ingress, Prometheus, ArgoCD)
1. Kubernetes Provider Overview
The kubernetes provider allows Terraform to interact with your K8s cluster.
Provider Example:
provider "kubernetes" {
config_path = "~/.kube/config" # or use config from data block
}
You can also dynamically get credentials after provisioning the cluster (EKS, GKE, AKS).
2. Provision Kubernetes Cluster (e.g., AWS EKS)
module "eks" {
source = "terraform-aws-modules/eks/aws"
cluster_name = "my-cluster"
cluster_version = "1.27"
subnet_ids = var.subnet_ids
vpc_id = var.vpc_id
enable_irsa = true
node_groups = {
default = {
desired_capacity = 2
instance_types = ["t3.medium"]
}
}
}
3. Connect Terraform to the Cluster (EKS Example)
Output kubeconfig dynamically:
data "aws_eks_cluster" "cluster" {
name = module.eks.cluster_name
}
data "aws_eks_cluster_auth" "cluster" {
name = module.eks.cluster_name
}
provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.cluster.token
}
4. Deploy Kubernetes Resources via Terraform
Create Namespace + Deployment:
resource "kubernetes_namespace" "example" {
metadata {
name = "demo"
}
}
resource "kubernetes_deployment" "nginx" {
metadata {
name = "nginx"
namespace = kubernetes_namespace.example.metadata[0].name
labels = {
app = "nginx"
}
}
spec {
replicas = 2
selector {
match_labels = {
app = "nginx"
}
}
template {
metadata {
labels = {
app = "nginx"
}
}
spec {
container {
name = "nginx"
image = "nginx:1.21"
port {
container_port = 80
}
}
}
}
}
}
5. Installing Helm Charts via Terraform
The helm provider allows you to deploy Helm charts using Terraform.
Provider Setup:
provider "helm" {
kubernetes {
config_path = "~/.kube/config"
}
}
Install NGINX Ingress Controller:
resource "helm_release" "nginx_ingress" {
name = "nginx-ingress"
namespace = "ingress-nginx"
repository = "https://kubernetes.github.io/ingress-nginx"
chart = "ingress-nginx"
version = "4.9.1"
create_namespace = true
values = [file("nginx-values.yaml")]
}
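If you would rather keep the values inline instead of in a separate nginx-values.yaml, a variant of the same release can build them with yamlencode() (the controller/replicaCount/service value names are assumptions about the ingress-nginx chart's values schema):

```hcl
resource "helm_release" "nginx_ingress" {
  name             = "nginx-ingress"
  namespace        = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  version          = "4.9.1"
  create_namespace = true

  # build the values YAML from an HCL object instead of a file
  values = [yamlencode({
    controller = {
      replicaCount = 2
      service      = { type = "LoadBalancer" }
    }
  })]
}
```

A values file is easier to diff and review; inline yamlencode() keeps everything in one place for small setups.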
Typical Flow: Cluster + App
Terraform:
├── Provision VPC, Subnets, IAM, Security Groups
├── Create EKS/GKE/AKS Cluster
├── Get Cluster Credentials
├── Apply K8s Resources (Deployment, Services, Secrets)
└── Install Helm Charts (Ingress, Monitoring, ArgoCD)
Best Practices
| Task | Best Practice |
| --- | --- |
| Cluster Bootstrap | Use a separate Terraform run for provisioning |
| K8s Resource Sync | Use `kubectl diff` or GitOps alongside TF |
| Sensitive Data | Use `sensitive = true`, secrets in Vault or SSM |
| Helm Charts | Pin chart versions, store values in versioned files |
| Dev vs Prod Separation | Use workspaces or folders per environment |
Cheat Sheet: Terraform + Kubernetes + Helm
| Tool | Role |
| --- | --- |
| `aws_eks_*` | Provision EKS infrastructure |
| `kubernetes_*` | Define K8s resources like Deployments |
| `helm_release` | Install Helm charts |
| `provider "kubernetes"` | Configure access to cluster |
| `output` | Share cluster endpoint, token, CA cert |
Case Study: End-to-End Terraform + Kubernetes + Helm Deployment
Background
A mid-size SaaS company is building a multi-tenant project management platform that needs:
Automated, secure, scalable cloud infrastructure
High availability across regions
Containerized microservices
CI/CD with GitOps
Monitoring, TLS, and secrets management
Their stack includes:
AWS for infrastructure (EKS, RDS, S3, Route53)
Kubernetes (EKS) for container orchestration
Helm for deploying services like Ingress, ArgoCD, Prometheus
Terraform as the single source of truth for infra
Project Goals
| Requirement | Tools Involved |
| --- | --- |
| Provision VPC, EKS, RDS | Terraform (`aws_*` modules) |
| Deploy Kubernetes resources | Terraform + `kubernetes` provider |
| Install Helm charts (Ingress, TLS, Monitoring) | Terraform + `helm` provider |
| GitOps with ArgoCD | Helm chart deployed via Terraform |
| Secrets and state management | AWS SSM + S3 + DynamoDB |
| Multi-env setup (dev, staging, prod) | Terraform workspaces |
Step-by-Step Flow
1. VPC and Network Setup (Terraform)
Use AWS VPC module to provision isolated network:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
name = "platform-vpc"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
public_subnets = ["10.0.3.0/24", "10.0.4.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
}
2. EKS Cluster + Node Groups (Terraform)
module "eks" {
source = "terraform-aws-modules/eks/aws"
cluster_name = "platform-eks"
cluster_version = "1.27"
subnet_ids = module.vpc.private_subnets
vpc_id = module.vpc.vpc_id
node_groups = {
default = {
desired_capacity = 3
instance_types = ["t3.medium"]
}
}
}
3. Kubernetes Provider Setup
Use dynamic credentials from the created EKS cluster:
data "aws_eks_cluster" "eks" {
name = module.eks.cluster_name
}
data "aws_eks_cluster_auth" "eks" {
name = module.eks.cluster_name
}
provider "kubernetes" {
host = data.aws_eks_cluster.eks.endpoint
token = data.aws_eks_cluster_auth.eks.token
cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks.certificate_authority[0].data)
}
4. Helm Provider Setup
provider "helm" {
kubernetes {
host = data.aws_eks_cluster.eks.endpoint
token = data.aws_eks_cluster_auth.eks.token
cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks.certificate_authority[0].data)
}
}
5. Helm: Install Ingress Controller + Cert Manager
resource "helm_release" "nginx_ingress" {
name = "nginx-ingress"
namespace = "ingress-nginx"
repository = "https://kubernetes.github.io/ingress-nginx"
chart = "ingress-nginx"
version = "4.9.1"
create_namespace = true
}
resource "helm_release" "cert_manager" {
name = "cert-manager"
namespace = "cert-manager"
repository = "https://charts.jetstack.io"
chart = "cert-manager"
version = "v1.13.1"
create_namespace = true
set {
name = "installCRDs"
value = "true"
}
}
6. Kubernetes: Deploy Application Namespace + Secrets
resource "kubernetes_namespace" "app" {
metadata {
name = "project-app"
}
}
resource "kubernetes_secret" "app_secret" {
metadata {
name = "db-credentials"
namespace = "project-app"
}
data = {
# note: the kubernetes provider base64-encodes `data` values automatically,
# so passing plain strings here avoids double-encoding
username = "prod_user"
password = "s3cr3t123"
}
}
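To consume that secret from a pod, the container block of a kubernetes_deployment can reference it by key. This fragment belongs inside the spec/template/spec/container block of a deployment (the DB_PASSWORD variable name is an assumption about the app):

```hcl
env {
  name = "DB_PASSWORD"
  value_from {
    secret_key_ref {
      # reference the secret created above by its state address
      name = kubernetes_secret.app_secret.metadata[0].name
      key  = "password"
    }
  }
}
```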
7. App Helm Chart Deployment
Assuming app team provides a Helm chart:
resource "helm_release" "project_app" {
name = "project-app"
namespace = "project-app"
chart = "./charts/project-app"
set {
name = "replicaCount"
value = 3
}
set {
name = "env.DATABASE_URL"
value = "postgres://prod_user:s3cr3t123@db.project.local:5432/prod"
}
}
8. Monitoring + GitOps Stack via Helm
resource "helm_release" "prometheus" {
name = "kube-prometheus-stack"
namespace = "monitoring"
repository = "https://prometheus-community.github.io/helm-charts"
chart = "kube-prometheus-stack"
version = "48.0.1"
create_namespace = true
}
resource "helm_release" "argocd" {
name = "argocd"
namespace = "argocd"
repository = "https://argoproj.github.io/argo-helm"
chart = "argo-cd"
version = "5.46.5"
create_namespace = true
}
Secrets & State Handling
Remote state: S3 with versioning + a DynamoDB lock table
Secrets: passed from AWS SSM or Vault → Terraform → Helm values
Sensitive variables:
variable "db_password" {
  type      = string
  sensitive = true
}
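That variable can then reach the app chart without being printed in plan output, using the helm provider's set_sensitive block. This is a fragment for inside the helm_release "project_app" resource; the env.DATABASE_PASSWORD value path is an assumption about the chart's values schema:

```hcl
set_sensitive {
  name  = "env.DATABASE_PASSWORD" # assumed chart value path
  value = var.db_password
}
```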
CI/CD Pipeline Steps
1. terraform init
2. terraform fmt && terraform validate
3. terraform plan -out=plan.tfplan
4. terraform apply plan.tfplan
5. Trigger Helm releases or re-run terraform if modules updated
6. Sync ArgoCD apps for GitOps-based workloads
Monitoring & Observability
Prometheus + Grafana installed via Terraform
Dashboards are auto-imported via config maps
TLS via Cert Manager and DNS challenge with Route53
Production Safeguards
Environments isolated via workspaces:
terraform workspace new prod
terraform workspace select prod
apply only allowed after plan approval in CI/CD
Terraform and ArgoCD run in tandem: Terraform for infra, ArgoCD for app state
Summary: Full Infra Stack Built with Terraform
| Layer | Tools |
| --- | --- |
| Cloud Infra (VPC, EKS) | Terraform + AWS modules |
| App Platform | Terraform + Helm + Kubernetes |
| Secrets | Terraform + SSM/Vault |
| Monitoring | Prometheus, Grafana (via Helm) |
| GitOps | ArgoCD via Helm, synced from Git |
| Security | TLS (Cert Manager), IAM roles |
| Auditability | Git commits + remote state versioning |
Written by Ahmad W Khan