Stop Exposing Secrets: A Step-by-Step Guide to Securing Terraform State with AWS S3 and DynamoDB


Every DevOps engineer knows this story: you're building infrastructure with Terraform, things move fast, and then you realize your `terraform.tfstate` file (containing IAM ARNs, secrets, database connection strings, and even cloud resource IDs) is sitting unprotected in a Git repo or on some shared drive. This isn't just a rookie mistake. I've seen it at scale, at startups and enterprises alike, leading to critical data leaks and security breaches.
Managing infrastructure as code is a non-negotiable part of modern DevOps, but leaving your state management as an afterthought is professional negligence. When you work alone, maybe you can get away with it. When you work on a team, it’s the first thing that will grind your workflow to a halt, creating merge conflicts from hell and state files that are perpetually out of sync. Let's fix this, permanently. This is my blueprint for a secure, scalable, and collaborative Terraform backend using AWS.
Architecture Context
Before we write a single line of code, let's understand the strategy. We're not just moving a file; we're building a system. By default, Terraform stores its state file locally. This is untenable for any serious project for two reasons:
1. Collaboration: Without a shared state file, your teammates can't see your infrastructure changes, leading to conflicts and resource duplication.
2. Security: Local state files often contain sensitive data in plain text: database passwords, application keys, you name it.
Our architecture will solve both problems by using two core AWS services:
- AWS S3 (Simple Storage Service): This will be the durable, centralized home for our `terraform.tfstate` file. We'll lock it down, enable versioning for rollback capabilities, and enforce encryption.
- AWS DynamoDB: This will act as our locking mechanism. When one engineer runs `terraform apply`, DynamoDB puts a lock on the state file. If another engineer tries to run it simultaneously, they'll get a clear "State Locked" message. This prevents race conditions and state corruption, which can be catastrophic in complex environments.
Here's what it looks like at a high level:
```
+------------------+
| Engineer / CI/CD |
+--------+---------+
         |
         | runs `terraform apply`
         v
+--------+---------+
|  Terraform CLI   |
+--------+---------+
         |
         v
+--------------------------- AWS Cloud ---------------------------+
|                                                                 |
|  1. Acquire lock (fails if held)  --->  +------------------+    |
|  4. Release lock when done        --->  |  DynamoDB Table  |    |
|                                         |   (for locks)    |    |
|                                         +------------------+    |
|                                                                 |
|  2. Read terraform.tfstate        --->  +------------------+    |
|  3. Write updated state           --->  |    S3 Bucket     |    |
|                                         | (for state file) |    |
|                                         +------------------+    |
|                                                                 |
+-----------------------------------------------------------------+

(Terraform computes the changes locally, then applies them.)
```
Workflow Explained:
1. Attempt Lock: Before performing any action, Terraform sends a request to the DynamoDB table to acquire a lock. If another process already holds the lock, this step fails, preventing concurrent runs.
2. Read State: Once the lock is acquired, Terraform reads the current `terraform.tfstate` file from the S3 bucket.
3. Write State: After successfully applying the changes, Terraform writes the new, updated state file back to the S3 bucket.
4. Release Lock: Finally, Terraform removes the lock entry from the DynamoDB table, allowing other engineers or processes to run Terraform.
This setup is the industry standard for a reason: it's robust, secure, and built on battle-tested AWS services.
Implementation Details
Talk is cheap. Let's build it. We'll use Terraform to provision the very resources needed to manage our Terraform state. It's a bit meta, but it's the right way to ensure the entire setup is reproducible and managed as code from day one.
#### Step 1: Create the S3 Bucket for State Storage
First, we define the S3 bucket. This isn't just any bucket; it's a fortress. We're enabling server-side encryption by default, blocking all public access, and turning on versioning to facilitate recovery from unintended modifications.
Create a file named `terraform_backend.tf`:
```hcl
# terraform_backend.tf

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1" # Match the region in your backend configuration
}

resource "aws_s3_bucket" "terraform_state" {
  bucket = "devops-unlocked-tfstate-bucket" # Bucket names are globally unique, so use your own!

  # Prevent accidental deletion of the state file bucket
  lifecycle {
    prevent_destroy = true
  }
}

# Keep a full history of state files so you can roll back after a bad write
resource "aws_s3_bucket_versioning" "terraform_state_versioning" {
  bucket = aws_s3_bucket.terraform_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt every state object at rest
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state_sse" {
  bucket = aws_s3_bucket.terraform_state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Block every form of public access to the bucket
resource "aws_s3_bucket_public_access_block" "terraform_state_pab" {
  bucket = aws_s3_bucket.terraform_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```
Run `terraform init` and `terraform apply` to create these resources. Their state will live in a temporary local state file for now; we'll migrate it onto the new backend in the final step.
#### Step 2: Create the DynamoDB Table for State Locking
Next, we provision the DynamoDB table. The only requirement for a Terraform lock table is a primary key named `LockID` of type String. We don't need any fancy provisioning; on-demand capacity is perfect for this use case.
Add this to your `terraform_backend.tf` file:
```hcl
# terraform_backend.tf (continued)

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "devops-unlocked-terraform-locks"
  billing_mode = "PAY_PER_REQUEST" # On-demand: you pay per lock operation, which costs pennies
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```
Architect's Note: Treat backend permissions as a blast-radius problem. The identity that runs Terraform needs only `s3:GetObject`, `s3:PutObject`, and `s3:DeleteObject` on the state file objects, and `dynamodb:GetItem`, `dynamodb:PutItem`, and `dynamodb:DeleteItem` on the lock table. Nothing more. Do not give your CI/CD pipeline or engineers `s3:*` on this bucket. That's how state files get leaked.
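Here's a minimal sketch of what that least-privilege policy can look like in Terraform, defined alongside the backend resources. The policy name is a placeholder, and note that in practice the S3 backend also needs `s3:ListBucket` on the bucket itself:

```hcl
# iam.tf (in the backend project) -- a sketch; the policy name is a placeholder

data "aws_iam_policy_document" "terraform_backend" {
  # Read/write access to the state objects only, not the bucket's settings
  statement {
    sid       = "StateObjectAccess"
    actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
    resources = ["${aws_s3_bucket.terraform_state.arn}/*"]
  }

  # The S3 backend also needs to list the bucket to locate state files
  statement {
    sid       = "StateBucketList"
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.terraform_state.arn]
  }

  # Lock handling: create, read, and delete lock entries -- nothing else
  statement {
    sid       = "LockTableAccess"
    actions   = ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:DeleteItem"]
    resources = [aws_dynamodb_table.terraform_locks.arn]
  }
}

resource "aws_iam_policy" "terraform_backend" {
  name   = "terraform-backend-least-privilege" # Placeholder name
  policy = data.aws_iam_policy_document.terraform_backend.json
}
```

Attach this policy to the CI/CD role or engineer group that runs Terraform, and nothing else. You can scope the object resource down further to a specific key prefix if each team owns its own state path.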
#### Step 3: Configure the Terraform Backend
Now that our S3 bucket and DynamoDB table exist, we can tell our main Terraform project to use them. In your primary Terraform project (not the one we just used to create the backend), add the `backend` configuration block.
Create a file named `backend.tf`:
```hcl
# backend.tf
terraform {
  backend "s3" {
    bucket         = "devops-unlocked-tfstate-bucket" # Must match the bucket name you created
    key            = "global/terraform.tfstate"       # The path to the state file within the bucket
    region         = "us-east-1"                      # The region where you created the resources
    dynamodb_table = "devops-unlocked-terraform-locks"
    encrypt        = true
  }
}
```
After adding this block, run `terraform init`. Terraform will detect the local state file and the new backend configuration, and ask if you want to migrate your state to the new S3 backend. Type `yes`. (If Terraform reports a changed backend configuration instead of prompting, re-run with `terraform init -migrate-state`.)
The fundamental setup is now complete. Your state is stored securely in S3, with locking managed by DynamoDB.
Pitfalls & Optimisations
Getting the basics right is a huge win, but senior engineers think about failure modes and optimization. Here's where people get tripped up:
- Forgetting `prevent_destroy`: I've seen a junior engineer accidentally run `terraform destroy` on the state management infrastructure itself. The `prevent_destroy = true` lifecycle block is a simple but powerful guardrail.
- Using a Single State File: The `key` in the backend configuration is your friend. Don't dump your entire organization's infrastructure into one massive state file; that's a performance and security nightmare. Break it down logically. A good pattern is `key = "networking/vpc/terraform.tfstate"` or `key = "services/my-app/prod/terraform.tfstate"` (see the first sketch after this list). Use Terraform workspaces to manage environments (dev, staging, prod) within a single configuration.
- IAM Misconfigurations: As mentioned in the Architect's Note, overly permissive IAM policies are the biggest threat. Audit these policies regularly. Assume the identity running Terraform could be compromised and limit the blast radius.
- Cross-Region Disaster Recovery: For mission-critical systems, consider enabling S3 Cross-Region Replication on your state bucket (see the second sketch after this list). If your primary AWS region goes down, you have a read-only copy of your state file, which can be invaluable for recovery analysis.
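To make the state-splitting pattern concrete, here's a sketch of two independent projects sharing one bucket and one lock table, each with its own `key` (the project layout is illustrative):

```hcl
# networking/backend.tf -- state for the VPC layer only
terraform {
  backend "s3" {
    bucket         = "devops-unlocked-tfstate-bucket"
    key            = "networking/vpc/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "devops-unlocked-terraform-locks"
    encrypt        = true
  }
}

# services/my-app/backend.tf -- state for one application, isolated from networking
terraform {
  backend "s3" {
    bucket         = "devops-unlocked-tfstate-bucket"
    key            = "services/my-app/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "devops-unlocked-terraform-locks"
    encrypt        = true
  }
}
```

Locks are scoped per state file, so the networking team and the app team never block each other. If you add workspaces on top, the S3 backend stores each workspace's state under an `env:/<workspace>/` prefix ahead of your `key` by default.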
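And here's a minimal sketch of cross-region replication for the state bucket. The replica bucket name and region are placeholders, and I'm assuming an `aws_iam_role.replication` with the standard S3 replication permissions already exists in the backend project:

```hcl
# dr.tf (in the backend project) -- a sketch; replica name, region, and the
# replication role are placeholders you'd adapt to your environment

provider "aws" {
  alias  = "dr"
  region = "us-west-2" # Any region other than your primary
}

resource "aws_s3_bucket" "terraform_state_replica" {
  provider = aws.dr
  bucket   = "devops-unlocked-tfstate-replica" # Placeholder name
}

# Replication requires versioning on the destination as well as the source
resource "aws_s3_bucket_versioning" "replica_versioning" {
  provider = aws.dr
  bucket   = aws_s3_bucket.terraform_state_replica.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_replication_configuration" "state_replication" {
  # Source bucket versioning must exist before replication can be enabled
  depends_on = [aws_s3_bucket_versioning.terraform_state_versioning]

  bucket = aws_s3_bucket.terraform_state.id
  role   = aws_iam_role.replication.arn # Assumed: a role with S3 replication permissions

  rule {
    id     = "replicate-state"
    status = "Enabled"

    filter {} # Replicate every object in the bucket

    delete_marker_replication {
      status = "Disabled"
    }

    destination {
      bucket        = aws_s3_bucket.terraform_state_replica.arn
      storage_class = "STANDARD"
    }
  }
}
```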
Unlocked: Your Key Takeaways
- Never Use Local State: Storing `terraform.tfstate` locally is insecure and prevents collaboration. It's a sign of an amateur setup.
- S3 is for State, DynamoDB is for Locking: Use S3 for durable, encrypted storage of your state file. Use DynamoDB to prevent concurrent executions and state corruption.
- Automate the Backend: Provision your backend resources using a separate, minimal Terraform project to ensure your entire setup is managed as code.
- Lock Down IAM: Your state backend is your most sensitive piece of infrastructure. Apply the principle of least privilege with surgical precision to the IAM roles that can access it.
- Structure Your State: Use a logical key structure and workspaces to avoid a monolithic state file. This improves performance, reduces blast radius, and makes your infrastructure easier to reason about.
Securing your Terraform state isn't just a best practice; it's a foundational requirement for building professional, production-grade infrastructure.
If your team is facing this challenge, I specialize in architecting these secure, audit-ready systems.
Email me for a strategic consultation: atif@devopsunlocked.dev
Explore my projects and connect on Upwork
