Store Terraform state remotely on S3 bucket

In my previous post, I briefly mentioned a better way to store your Terraform state file. Instead of keeping it on your local computer, it's better to keep it remotely. This is especially useful when working with others on different computers.

In this blog post, I'll show you how to set up remote storage for your Terraform state using an Amazon S3 bucket and a DynamoDB table within your existing Terraform project. This lets you manage the backend resources alongside the other resources in the project.

Let's dive into each step together! 🤿

Overview

This is a chicken-and-egg situation: you want to use Terraform to create the very backend resources that will store its own state. To make this work, you have to use a two-step process:

  1. Write Terraform code to create an S3 bucket and a DynamoDB table, and deploy that code with a local state file (local backend).

  2. Go back to the Terraform code, add an s3 remote backend configuration that uses the newly created S3 bucket and DynamoDB table as the state storage, and run terraform init again to move your local state file into the S3 bucket.

Later, if you would like to destroy the entire stack, including the S3 bucket and DynamoDB table used for the remote backend, you need to do another two-step process in reverse:

  1. Go to the Terraform code, remove the backend configuration, and run terraform init -migrate-state to copy the Terraform state back to your local disk.

  2. Run terraform destroy to delete the S3 bucket and DynamoDB table along with other resources from the entire stack.

With this roadmap in mind, let's begin our exploration of each step in detail.

NOTE: I have already prepared Terraform configurations here that illustrate the steps outlined above. You can visit my repository, create a codespace from it, and follow along there.

Migration 🚚

In this section, I'll explain how to modify an existing Terraform project so that its state moves from the local state file to an S3 bucket defined within the same project. Here's how:

Step 1: Define & Apply additional resources for backend

First, in the Terraform code, I define an S3 bucket as the remote location to store the Terraform state file. Here are the resource blocks that I configure for this S3 bucket in my repository:

########################
# Secure state storage #
########################

resource "aws_s3_bucket" "backend_bucket" {
  bucket = "terraform-backend-<random-string>"

  # After migrating the remote state back to a local file, the remote state file still exists in the backend bucket.
  # With `force_destroy = true`, `terraform destroy` can delete the backend bucket even if the remote state file still exists.
  force_destroy = true

  tags = {
    Name = "S3 Remote Terraform State Store"
  }
}

resource "aws_s3_bucket_versioning" "backend_bucket_versioning" {
  bucket = aws_s3_bucket.backend_bucket.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_lifecycle_configuration" "backend_bucket_lifecycle" {
  bucket = aws_s3_bucket.backend_bucket.id

  rule {
    id = "noncurrent-version-expiration-rule"

    # NOTE: An empty filter applies this rule to all objects in the bucket.
    filter {}

    noncurrent_version_expiration {
      # NOTE: The number of days an object is noncurrent before Amazon S3 can perform the associated action.
      noncurrent_days = 7

      # NOTE: The number of noncurrent versions Amazon S3 will retain.
      # newer_noncurrent_versions = 10
    }

    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "backend_bucket_sse" {
  bucket = aws_s3_bucket.backend_bucket.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256" # NOTE: Using server-side encryption with Amazon S3-managed encryption keys (SSE-S3)
    }
  }
}

resource "aws_s3_bucket_public_access_block" "backend_bucket_public_access_block" {
  bucket = aws_s3_bucket.backend_bucket.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

I don't just define the S3 bucket with its default properties; I also enhance its security with additional layers of protection. Each aspect of this S3 bucket can be described as follows:

  • resource "aws_s3_bucket" "backend_bucket"

    • bucket = "terraform-backend-<random-string>": This specifies the name of the S3 bucket. Note that S3 bucket names must be globally unique among all AWS customers, so please choose the <random-string> wisely.

    • force_destroy = true: This ensures that even if the remote state file still exists, running terraform destroy will remove the backend bucket.

  • resource "aws_s3_bucket_versioning" "backend_bucket_versioning": This enables versioning on the S3 bucket, ensuring that every update to a file in the bucket creates a new version. This allows you to view older versions of the remote state file and revert to them if needed, serving as a useful fallback mechanism.

  • resource "aws_s3_bucket_lifecycle_configuration" "backend_bucket_lifecycle": This expires noncurrent versions of the remote state file 7 days after they are superseded, so old versions don't accumulate indefinitely.

  • resource "aws_s3_bucket_server_side_encryption_configuration" "backend_bucket_sse": This ensures that your state files, along with any contained secrets, are always encrypted on disk when stored in the S3 bucket.

  • resource "aws_s3_bucket_public_access_block" "backend_bucket_public_access_block": This will block all public access to the S3 bucket. Since Terraform state files may contain sensitive data and secrets, adding this extra layer of protection ensures that no one on your team can accidentally make this S3 bucket public.
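For the <random-string> part of the bucket name, one option is to let Terraform generate it. The following is only a sketch: it assumes you are willing to add the hashicorp/random provider to the project, and the names backend_bucket_suffix and backend_bucket_name are hypothetical.

```hcl
# Hypothetical helper: generates a random suffix that stays stable across applies.
# Uses the hashicorp/random provider, so `terraform init` must be run again
# to install it before the next plan.
resource "random_id" "backend_bucket_suffix" {
  byte_length = 4 # 4 random bytes, exposed as 8 hexadecimal characters via `.hex`
}

locals {
  # Assumed naming scheme, following the "terraform-backend-<random-string>" pattern.
  backend_bucket_name = "terraform-backend-${random_id.backend_bucket_suffix.hex}"
}

# In the bucket resource, the literal name would then be replaced with:
#   bucket = local.backend_bucket_name
```

Keep in mind that a backend "s3" block cannot reference locals or resource attributes, so the generated name still has to be copied into the backend configuration as a literal string.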

If you're collaborating on a project with multiple team members and using the S3 bucket as a remote state backend, a DynamoDB table provides the locking mechanism. The table needs only a single attribute named "LockID" as its partition (hash) key. Terraform writes a lock item under that key while an operation holds the state, and the presence of that item tells other runs that they cannot proceed.

During Terraform operations (such as plan, apply, destroy), the state file is locked for the duration of the operation. If another developer attempts to execute operations concurrently, the request is denied. Operations can resume once the current operation is complete, and the lock on the state file is released from the DynamoDB table.

Here's how we define the DynamoDB table for the locking purpose:

#######################
# State locking table #
#######################

resource "aws_dynamodb_table" "backend_state_lock_tbl" {
  name = "terraform-backend-state-locking" # NOTE: You can change the table name as desired.

  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S" # NOTE: String type
  }

  tags = {
    "Name" = "DynamoDB Terraform State Lock Table"
  }
}

NOTE: Setting billing_mode = "PAY_PER_REQUEST" selects the on-demand capacity mode for the DynamoDB table, where you pay only for the requests actually made. This suits tables with light or infrequent traffic, such as a lock table or one used mainly for testing.

Let's begin by creating the remote backend resources on AWS and saving the state locally. Run the following commands:

terraform plan -out /tmp/tfplan
terraform apply /tmp/tfplan

You should notice that a terraform.tfstate file is created, containing the state information corresponding to the newly created backend resources. Proceed to the next step to transfer its contents to the remote storage.

Step 2: Configure backend & Reinitialize Terraform project

Next, add a backend configuration for the S3 bucket to your Terraform code. This configuration is specific to Terraform and is placed within a terraform block. Here's how you can structure it:

terraform {

  # ... <Other blocks> ...

  backend "s3" {
    bucket         = "terraform-backend-<random-string>"
    region         = "ap-southeast-1"
    key            = "path/to/remote_terraform.tfstate" # NOTE: Change the object key for the state file as desired.
    # NOTE: `encrypt = true` as a second layer in addition to `backend_bucket_sse` to ensure that the state file is always encrypted on the S3 bucket.
    encrypt        = true
    dynamodb_table = "terraform-backend-state-locking"
  }
}

NOTE: Make sure to replace <random-string> with the same suffix you used in the name of the S3 bucket you created earlier.
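Because a backend block only accepts literal values, the bucket and table names have to be typed in by hand. As a small sketch to reduce copy-paste mistakes, you could expose the real names as outputs in Step 1 and read them back before editing the backend block. The output names below are hypothetical; the resource addresses are the ones defined earlier in this post.

```hcl
# Hypothetical outputs that print the exact values needed in the backend "s3" block.
output "backend_bucket_name" {
  description = "Name of the S3 bucket that stores the Terraform state"
  value       = aws_s3_bucket.backend_bucket.bucket
}

output "backend_lock_table_name" {
  description = "Name of the DynamoDB table used for state locking"
  value       = aws_dynamodb_table.backend_state_lock_tbl.name
}
```

After terraform apply, running terraform output backend_bucket_name prints the value to paste into the bucket argument.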

To instruct Terraform to store your state file in this S3 bucket, you'll need to run the terraform init command again. This command not only downloads provider code but also configures your Terraform backend. Since the init command is idempotent, it's safe to run it multiple times.

Terraform also needs AWS credentials to migrate the state file to the S3 bucket. If you haven't set AWS access keys in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, you can supply them through a partial backend configuration kept in a separate file, such as backend.tfvars:

access_key = "<aws-account-access-key>"
secret_key = "<aws-account-secret-key>"

Since this file holds secrets, keep it out of version control. Then add the -backend-config=backend.tfvars option to the terraform init command so that Terraform can authenticate while migrating the state file:

terraform init -backend-config=backend.tfvars

After running the above command, Terraform will recognize that you already have a state file locally and prompt you to copy it to the new S3 backend. Confirm by typing "yes" and your Terraform state will be stored in the S3 bucket. You can verify this by navigating to the bucket in the Amazon S3 web console.
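The credentials used with the backend need access to both the state object and the lock table. Below is a minimal sketch of what such a policy might look like, assuming the resource names from Step 1 and the example object key from the backend block. The data source name backend_access is hypothetical, and the action list reflects my understanding of what the S3 backend with DynamoDB locking uses, so please check it against the S3 backend documentation before relying on it.

```hcl
# Hypothetical sketch: narrow access to the remote state object and its lock table.
data "aws_iam_policy_document" "backend_access" {
  statement {
    sid       = "ListStateBucket"
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.backend_bucket.arn]
  }

  statement {
    sid     = "ReadWriteStateObject"
    actions = ["s3:GetObject", "s3:PutObject"]
    # Assumes the same object key as in the backend "s3" block.
    resources = ["${aws_s3_bucket.backend_bucket.arn}/path/to/remote_terraform.tfstate"]
  }

  statement {
    sid = "LockAndUnlockState"
    actions = [
      "dynamodb:DescribeTable",
      "dynamodb:GetItem",
      "dynamodb:PutItem",
      "dynamodb:DeleteItem",
    ]
    resources = [aws_dynamodb_table.backend_state_lock_tbl.arn]
  }
}
```

The rendered JSON is available as data.aws_iam_policy_document.backend_access.json and could be attached to a user or role through an aws_iam_policy resource.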

Clean Up 🧹

As mentioned earlier, to delete all resources, including the backend resources, you'll also need to perform two separate tasks:

Step 1: Remove backend configuration & Reinitialize Terraform project

Begin by deleting the entire backend "s3" block inside the terraform block from your code. Then, copy the Terraform state information back to your local disk by executing the following command:

terraform init -migrate-state

You'll notice that the state information is now restored to the terraform.tfstate file.
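Deleting the block makes Terraform fall back to its default local backend. If you prefer to make that explicit instead, a sketch of the equivalent configuration could look like this, where path is set to the local backend's default value:

```hcl
terraform {

  # ... <Other blocks> ...

  # Explicit local backend, replacing the former backend "s3" block.
  backend "local" {
    path = "terraform.tfstate" # default location, relative to the working directory
  }
}
```

Running terraform init -migrate-state against this configuration should prompt for the same copy of the state from S3 back to the local file.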

Step 2: Destroy all resources

To delete the resources on AWS, run the following command:

terraform destroy

Conclusion

In summary, this blog post has shown you how to set up a remote Terraform state backend using an Amazon S3 bucket and a DynamoDB table in your existing project. With this setup, you can manage your infrastructure more efficiently and work better with your team.

Happy Terraforming! 🤗

Written by

Prasit (O) Sutthikamolsakul