The Blueprint: My Opinionated Terragrunt Project Structure for Scalable Teams

Atif Farrukh

Real-World Problem Intro

I’ve walked into too many organizations where the Terraform setup is a ticking time bomb. It starts with the best intentions: isolating components into their own directories. But then the rot sets in. Every single directory has a nearly identical backend.tf and provider.tf file. The state file is a monolithic beast that takes ages to lock, forcing every developer into a single-file line to make a change. When you need to update the provider version or change a backend setting, it’s a full-day grep and sed adventure across a hundred directories. Reusing code means copy-pasting directories, which inevitably drift and become snowflakes of misconfiguration. An engineer "forgets" to copy the backend config, initializes locally, and suddenly you have a rogue state file living on their laptop.

You've solved the monolithic state file problem but traded it for death by a thousand papercuts. Your code isn't DRY (Don't Repeat Yourself); it's WET (Write Everything Twice... or twenty times). This boilerplate creep doesn't just waste time; it creates seams for catastrophic errors. Your Infrastructure as Code, meant to be a source of truth, becomes a minefield of configuration drift.

You know the pain. It’s that feeling in your gut when you type terraform apply and have no real confidence in what’s about to break. It's the technical debt that grinds your platform teams to a halt, turning what should be a high-speed IaC highway into a gridlocked nightmare. This isn't just an inconvenience; it's a direct threat to your ability to ship features, stay secure, and scale.

Architecture Context

Before we dive into the code, let's establish the "why." A good Terraform structure isn't about rigid dogma; it's about creating a paved road for your developers. It makes the "right way" the "easy way." The goal is to enable teams to self-serve infrastructure safely and efficiently without needing to understand the entire universe of your cloud environment.

We’re designing for three core principles:

  1. Isolation: An engineer deploying a change to the user-service in staging should have zero ability to accidentally torpedo the production database. This means separating state files, credentials, and configuration per environment and component.

  2. Reusability: We build components once and reuse them everywhere. An S3 bucket module or an EKS cluster module should be a standardized, battle-tested building block, not something reinvented for every new project.

  3. Clarity: Anyone on the team should be able to look at the repository and understand exactly where to find the code for a specific component in a specific environment. The structure itself should be self-documenting.

Here is the high-level architecture of the repository structure we’re aiming for.

Implementation Details

This is my blueprint. It's been forged in the fires of production deployments, failed audits, and rapid scaling. The structure is designed to be managed by multiple team members without constant merge conflicts.

Directory Structure

Let's break down the repository layout:

├── environments
│   ├── _shared
│   │   └── common.hcl
│   ├── dev
│   │   ├── env.hcl
│   │   ├── region.hcl
│   │   ├── compute
│   │   │   └── eks
│   │   │       └── terragrunt.hcl
│   │   └── data
│   │       └── rds
│   │           └── terragrunt.hcl
│   └── prod
│       ├── env.hcl
│       ├── region.hcl
│       └── ...
├── modules
│   ├── aws
│   │   ├── compute
│   │   │   └── eks-cluster
│   │   │       ├── main.tf
│   │   │       ├── variables.tf
│   │   │       └── outputs.tf
│   │   └── data
│   │       └── rds
│   │           └── ...
├── terragrunt.hcl
└── .gitignore

  • terragrunt.hcl (root): This is the brains of the entire operation. This single file defines your remote state backend for every component.

  • modules/: Your library of reusable Terraform modules. No Terragrunt files live here.

  • environments/: The actual infrastructure.

    • common.hcl, region.hcl, and env.hcl: These hold the variable inputs for their respective scopes (shared, region, and environment).

    • terragrunt.hcl (component): It’s a tiny file that tells Terragrunt which module to use and what inputs to pass.
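
As a minimal sketch of what those scope files might contain: the root configuration reads each file's locals, so each file is just a locals block. Every key and value below (project, tags, aws_region, environment) is a hypothetical placeholder, not something the layout prescribes.

```hcl
# environments/_shared/common.hcl (hypothetical contents)
locals {
  project = "devops-unlocked"
  tags = {
    ManagedBy = "terragrunt"
  }
}
```

```hcl
# environments/dev/region.hcl (hypothetical contents)
locals {
  aws_region = "us-east-1"
}
```

```hcl
# environments/dev/env.hcl (hypothetical contents)
locals {
  environment = "dev"
}
```

Keeping one small file per scope means a change to, say, the region touches exactly one line in one place.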

The Root terragrunt.hcl

Magic happens here. This file at the repository root defines the backend and variable loading strategy for everything below it.

# terragrunt.hcl

# Configure the S3 backend once for all modules.
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    # Replace with your S3 bucket and DynamoDB table
    bucket         = "devops-unlocked-tfstate"
    key            = "${path_relative_to_include()}/terraform.tfstate" # Dynamic key path
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "devops-unlocked-tf-locks"
  }
}

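# Merge the locals from each variable file into the inputs passed to every
# component. Later arguments win, so env.hcl overrides region.hcl, which
# overrides common.hcl.
inputs = merge(
  local.common_vars.locals,
  local.region_vars.locals,
  local.env_vars.locals,
)

# Locate and read the variable files by walking up from the component directory.
# The default passed to read_terragrunt_config carries an empty locals map so
# the merge above can still resolve .locals. Note that find_in_parent_folders
# itself errors when a file is not found unless it is given its own fallback.
locals {
  common_vars = read_terragrunt_config(find_in_parent_folders("environments/_shared/common.hcl"), { locals = {} })
  region_vars = read_terragrunt_config(find_in_parent_folders("region.hcl"), { locals = {} })
  env_vars    = read_terragrunt_config(find_in_parent_folders("env.hcl"), { locals = {} })
}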
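
To make the dynamic key concrete, here is roughly what the generated backend.tf should look like for the component at environments/dev/compute/eks, assuming the root file sits at the repository root so that path_relative_to_include() resolves to environments/dev/compute/eks. This is an illustration only; the exact layout, attribute order, and header comment of the generated file depend on the Terragrunt version.

```hcl
# backend.tf (generated by Terragrunt in the component's working directory;
# illustrative rendering, not meant to be written or committed by hand)
terraform {
  backend "s3" {
    bucket         = "devops-unlocked-tfstate"
    key            = "environments/dev/compute/eks/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "devops-unlocked-tf-locks"
  }
}
```

Because the key mirrors the directory path, every component gets its own state file without anyone choosing a name for it.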

The "Live" Configuration

This is all that's needed in environments/dev/compute/eks/terragrunt.hcl. Everything else is inherited, which keeps the component file clean.

# environments/dev/compute/eks/terragrunt.hcl

# Inherit all the settings from the root terragrunt.hcl file.
include "root" {
  path = find_in_parent_folders()
}

# Point to the actual Terraform module in your library.
terraform {
  source = "../../../../modules/aws/compute/eks-cluster"
}

# Define only the inputs specific to THIS component.
# All common inputs (tags, env name, etc.) are inherited automatically.
inputs = {
  instance_type   = "t3.large"
  cluster_version = "1.28"
  min_size        = 2
}

No backend block. No provider block. Just a pointer to the module and the inputs needed to configure the EKS cluster. This is the paved road.
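
The root file shown earlier only generates backend.tf, so the provider block has to come from somewhere too. One way to do that, sketched here under assumptions, is a generate block in the root terragrunt.hcl. The aws_region and environment keys are hypothetical names that would need to exist in your region.hcl and env.hcl locals; they are not defined anywhere in this article.

```hcl
# Addition to the root terragrunt.hcl (sketch): write provider.tf next to backend.tf.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  # aws_region is an assumed key in region.hcl
  region = "${local.region_vars.locals.aws_region}"

  default_tags {
    tags = {
      # environment is an assumed key in env.hcl
      Environment = "${local.env_vars.locals.environment}"
    }
  }
}
EOF
}
```

With this in place, a provider change is a single edit in the root file rather than a hunt across every component directory.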

Architect's Note

To truly lock this down, use different root terragrunt.hcl files for your foundational versus application infrastructure. Your foundations directory, which defines your AWS Organization, IAM roles, and the Terraform state bucket itself, should have its own terragrunt.hcl with its own state. Application teams should only have access to include the application-level root config. This creates a powerful security boundary that prevents a misconfigured web app deployment from modifying a core IAM policy.
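
A minimal sketch of what the second root could look like, assuming a hypothetical foundations/ directory and hypothetical bucket and lock table names. The actual security boundary would come from the IAM policies on that bucket and the roles each team can assume, not from Terragrunt itself.

```hcl
# foundations/terragrunt.hcl (hypothetical second root for foundational infrastructure)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "devops-unlocked-foundations-tfstate"  # hypothetical, separate bucket
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "devops-unlocked-foundations-tf-locks" # hypothetical, separate lock table
  }
}
```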

Pitfalls & Optimisations

Terragrunt is powerful, but it's not a silver bullet. Here's where people get it wrong.

  1. The Learning Curve: Terragrunt's functions and blocks (find_in_parent_folders, dependency, and so on) are not native Terraform. Your team needs to understand what they do. Don't let Terragrunt become "magic" that no one can debug.

  2. Overly Complex Hierarchies: It can be tempting to create dozens of env.hcl files at different levels. Keep it simple: have one for shared, one for the region, and one for the environment (dev/prod).

  3. Optimisation: Deploying an Entire Environment: The run --all command is your best friend (older Terragrunt releases spell it run-all). From the environments/dev directory, you can run terragrunt run --all plan to preview, or terragrunt run --all apply to deploy, every single component in that environment in the correct order.

  4. Optimisation: Sharing Outputs with dependency blocks: This is the killer feature. If your EKS cluster needs the VPC ID from your networking component, you don't hardcode it. You add a dependency block.

# In environments/dev/compute/eks/terragrunt.hcl

dependency "vpc" {
  config_path = "../../network/vpc"
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
  # ... other inputs
}

Terragrunt will automatically run terraform output on the VPC component and make the value available. It also builds a dependency graph, so a run --all apply always deploys the VPC before the EKS cluster.
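
One practical wrinkle: on a brand-new environment the VPC has not been applied yet, so there are no outputs to read and a plan of the EKS component fails. A common way around this, sketched below, is to give the dependency block mock outputs that are only allowed for non-mutating commands. The vpc-00000000 value is a made-up placeholder.

```hcl
# In environments/dev/compute/eks/terragrunt.hcl (sketch)

dependency "vpc" {
  config_path = "../../network/vpc"

  # Placeholder used only while the VPC has no real outputs yet.
  mock_outputs = {
    vpc_id = "vpc-00000000"
  }

  # Never let the placeholder reach an apply.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}
```

Restricting the mocks to validate and plan keeps the first plan of a fresh environment working while ensuring a real apply always uses the real VPC ID.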

Unlocked: Your Key Takeaways

  • Embrace a Wrapper: For serious scale, vanilla Terraform leads to massive boilerplate. Use Terragrunt to keep your code DRY.

  • Centralize Your Backend: Define your remote state in one root terragrunt.hcl file. Let Terragrunt handle the rest.

  • Use Hierarchical Config: Structure your variables in common.hcl, region.hcl, and env.hcl files to keep component configurations lean.

  • Master dependency Blocks: Use dependency blocks to share outputs between components. This creates a clear, explicit dependency graph.

  • Deploy with run --all: Manage entire environments with a single command (terragrunt run --all apply), letting Terragrunt handle the execution order.

  • Test Your Infrastructure Code: Apply software testing principles to your Terraform modules to ensure reliability and correctness.

Stop fighting Terraform's boilerplate. A disciplined structure powered by Terragrunt is the key to building a secure, scalable, and maintainable cloud foundation.

Connect and Collaborate

If your team is facing the chaos of an unstructured IaC setup, I specialize in architecting these secure, audit-ready systems. You can find my work history and invite me to collaborate on my personal Upwork profile or connect with me on LinkedIn.

For more DevOps tips, article updates, and to join a community of builders, follow the official @DevOps_Unlocked account on Twitter!
