Step-by-Step Guide: Provisioning EKS with Dynamic EBS Volumes Using Terraform

Daawar Pandit

In this post, I’ll show you how to provision an Amazon EKS cluster with dynamic EBS volume support using Terraform. We’ll cover every step: VPC setup, IAM, EKS add-ons, and testing persistent storage with Kubernetes manifests. This guide is ideal for anyone looking to run stateful workloads on AWS EKS with best practices.

Link to Repo: eks-ebs-dynamic-provisioning

Table of Contents

  1. Introduction

  2. Project Overview

  3. VPC and Networking

  4. EKS Cluster and Node Groups

  5. IAM for EBS CSI Driver

  6. EBS CSI Driver Add-on

  7. Kubernetes Storage Manifests

  8. Testing and Verification Process

  9. Best Practices & Considerations

  10. Conclusion

Introduction

Running stateful applications—like databases or persistent caches—on Kubernetes requires durable, performant storage. AWS EBS is the go-to block storage solution for EKS. In this hands-on guide, you’ll learn how to provision a production-ready EKS cluster with dynamic EBS support using Terraform and the AWS EBS CSI driver. We'll cover everything from VPC setup and IAM configuration to Kubernetes manifests and real-world verification.

Project Overview

This project demonstrates how to:

  1. Provision a complete EKS cluster infrastructure using Terraform.

  2. Configure and enable AWS EBS CSI (Container Storage Interface) driver.

  3. Create the necessary IAM permissions for EBS volume management.

  4. Deploy Kubernetes manifests to utilize EBS storage in applications.

  5. Test persistent storage with a StatefulSet application.

VPC and Networking

We use the official Terraform AWS VPC module to create a VPC with three public and three private subnets, a NAT gateway, and the subnet tags that EKS load balancers and the cluster autoscaler rely on for discovery.

Key points:

  • Three public and three private subnets for high availability.

  • NAT gateway for outbound internet access from private subnets.

  • Subnet tags for EKS and cluster autoscaler discovery.

Terraform File: vpc.tf

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  name    = "${var.cluster_name}-vpc"
  cidr    = var.vpc_cidr
  azs     = local.azs
  private_subnets = [for k, v in local.azs : cidrsubnet(var.vpc_cidr, 4, k)]
  public_subnets  = [for k, v in local.azs : cidrsubnet(var.vpc_cidr, 8, k + 48)]
  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true
  enable_dns_support   = true
  public_subnet_tags  = {
    "kubernetes.io/role/elb" = 1
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
  }
  private_subnet_tags = {
    "kubernetes.io/role/internal-elb" = 1
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "k8s.io/cluster-autoscaler/enabled" = "true"
    "k8s.io/cluster-autoscaler/${var.cluster_name}" = "owned"
  }
  tags = merge(var.tags, { "Name" = "${var.cluster_name}-vpc" })
}
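The module call above references `local.azs` and several input variables that are not shown. A minimal sketch of the supporting definitions (file names and defaults are assumptions inferred from the references in vpc.tf):

```hcl
# locals.tf — pick the first three AZs in the current region
data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  azs = slice(data.aws_availability_zones.available.names, 0, 3)
}

# variables.tf — inputs referenced by the VPC module call
variable "cluster_name" {
  type    = string
  default = "demo-eks"
}

variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16"
}

variable "tags" {
  type    = map(string)
  default = {}
}
```

With a /16 VPC CIDR, the `cidrsubnet` expressions above yield three /20 private subnets and three /24 public subnets, one per AZ.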

EKS Cluster and Node Groups

We use the terraform-aws-modules/eks/aws module to provision the EKS cluster and managed node groups.

Highlights:

  • Both public and private API endpoints enabled.

  • Managed node groups with autoscaling (min: 2, max: 5, desired: 3).

  • IRSA (IAM Roles for Service Accounts) enabled for secure add-on integration.

  • EKS add-ons (CoreDNS, kube-proxy, VPC CNI, EBS CSI driver, Pod Identity Agent).

Terraform File: main.tf

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  cluster_name    = var.cluster_name
  cluster_version = var.cluster_version
  vpc_id          = module.vpc.vpc_id
  subnet_ids      = module.vpc.private_subnets
  enable_irsa     = true
  cluster_endpoint_public_access  = true
  cluster_endpoint_private_access = true
  cluster_addons = {
    coredns = { most_recent = true }
    kube-proxy = { most_recent = true }
    vpc-cni = { most_recent = true }
    aws-ebs-csi-driver = { most_recent = true }
    eks-pod-identity-agent = { most_recent = true }
  }
  eks_managed_node_group_defaults = {
    ami_type       = "AL2_x86_64"
    instance_types = var.instance_types
    attach_cluster_primary_security_group = true
    use_custom_launch_template = false
    create_iam_role            = true
    iam_role_additional_policies = {
      AmazonEKSWorkerNodePolicy          = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
      AmazonEKS_CNI_Policy               = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
      AmazonEC2ContainerRegistryReadOnly = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
      AmazonSSMManagedInstanceCore       = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
    }
    vpc_security_group_ids = [aws_security_group.eks_node_group_sg.id]
  }
  eks_managed_node_groups = {
    default_node_group = {
      name = "managed-node-group"
      min_size     = var.min_size
      max_size     = var.max_size
      desired_size = var.desired_size
      enable_monitoring = true
      capacity_type  = "ON_DEMAND"
      tags = var.tags
    }
  }
  manage_aws_auth_configmap = true
  tags = var.tags
}
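Because `manage_aws_auth_configmap = true` makes Terraform write to the aws-auth ConfigMap inside the cluster, a Kubernetes provider must be configured to authenticate against the new cluster. A sketch of the provider wiring this assumes (output names match recent versions of the EKS module; verify against your module version):

```hcl
# providers.tf — authenticate the Kubernetes provider against the new cluster
provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # Fetch a short-lived token via the AWS CLI
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
  }
}
```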

IAM for EBS CSI Driver

The EBS CSI driver needs permissions to manage EBS volumes. We create a custom IAM policy and attach it to the node group role.

Key permissions:

  • Create, attach, detach, delete, and describe EBS volumes and snapshots.

  • Tagging permissions for volumes and snapshots.

Terraform File: iam.tf

resource "aws_iam_policy" "eks_ebs_csi_policy" {
  name        = "${var.cluster_name}-AmazonEBSCSIDriverPolicy"
  description = "Policy to allow EKS nodes to manage EBS volumes"
  policy      = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ec2:CreateVolume", "ec2:DeleteVolume", "ec2:DetachVolume", "ec2:AttachVolume",
          "ec2:DescribeVolumes", "ec2:DescribeVolumesModifications", "ec2:DescribeInstances",
          "ec2:CreateSnapshot", "ec2:DeleteSnapshot", "ec2:DescribeSnapshots", "ec2:ModifyVolume"
        ]
        Resource = "*"
      },
      {
        Effect = "Allow"
        Action = ["ec2:CreateTags", "ec2:DeleteTags"]
        Resource = [
          "arn:aws:ec2:*:*:volume/*",
          "arn:aws:ec2:*:*:snapshot/*"
        ]
        Condition = {
          StringEquals = {
            "ec2:CreateAction" = ["CreateVolume", "CreateSnapshot"]
          }
        }
      }
    ]
  })
  tags = var.tags
}

resource "aws_iam_role_policy_attachment" "eks_ebs_csi_policy_attachment" {
  policy_arn = aws_iam_policy.eks_ebs_csi_policy.arn
  role       = module.eks.eks_managed_node_groups["default_node_group"].iam_role_name
}
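Attaching the policy to the node role works, but it grants EBS permissions to every pod on the node. Since `enable_irsa = true` is already set, a tighter alternative is to scope the permissions to the CSI driver's service account. A hedged sketch using the community IRSA sub-module (module source and attribute names as published by terraform-aws-modules; the role name is an assumption):

```hcl
# Scoped alternative: bind the AWS-managed EBS CSI policy to the
# driver's service account via IRSA instead of the node role.
module "ebs_csi_irsa" {
  source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"

  role_name             = "${var.cluster_name}-ebs-csi-irsa"
  attach_ebs_csi_policy = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"]
    }
  }
}
```

The role can then be handed to the add-on by setting `service_account_role_arn = module.ebs_csi_irsa.iam_role_arn` in the `aws-ebs-csi-driver` entry of `cluster_addons`.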

EBS CSI Driver Add-on

The EBS CSI driver is enabled as an EKS add-on, so you don’t need to install it manually. This is handled in the cluster_addons block:

cluster_addons = {
  aws-ebs-csi-driver = {
    most_recent = true
  }
  # ...other add-ons
}

Why this matters:
This add-on allows Kubernetes to dynamically provision and manage EBS volumes for your pods, using the permissions you set up above.

The integration between EKS and EBS is accomplished through these key components:

  1. AWS EBS CSI Driver Add-on: Installed via the EKS cluster add-ons in Terraform

  2. IAM Permissions: Custom policy attached to node group role

  3. IRSA (IAM Roles for Service Accounts): Enabled via enable_irsa = true in your EKS module

Kubernetes Storage Manifests

To test dynamic provisioning, we use the following manifests (K8s-Manifests).

1. StorageClass (storage-class.yaml)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  fsType: ext4
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer

This StorageClass defines:

  • The EBS CSI driver as the provisioner (ebs.csi.aws.com)

  • gp3 volume type for a better price/performance ratio than gp2

  • Ext4 as the filesystem

  • Volume expansion capabilities

  • WaitForFirstConsumer binding mode (volumes are only created when a pod claims them)

2. PersistentVolumeClaim (persistent-volume-claim.yaml)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 10Gi

This PVC requests a 10 GiB volume using your StorageClass. Important notes:

  • EBS volumes can only be attached to one node at a time (ReadWriteOnce)

  • The claim references your custom StorageClass ebs-sc

3. StatefulSet Application (statefulset.yaml)

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web-app
spec:
  replicas: 2
  # Other configuration
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: ebs-sc
      resources:
        requests:
          storage: 10Gi

This StatefulSet:

  • Creates an NGINX application with 2 replicas

  • Uses volumeClaimTemplates to dynamically provision EBS volumes

  • Mounts the volumes at /usr/share/nginx/html

  • Each pod gets its own dedicated EBS volume that persists across pod restarts.
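The `# Other configuration` placeholder above elides the pod template. A minimal sketch of what it could contain, wiring the `www` claim template to the mount path described here (image, labels, and service name are illustrative, not taken from the repo):

```yaml
# Illustrative pod template for the StatefulSet above
serviceName: web-app
selector:
  matchLabels:
    app: web-app
template:
  metadata:
    labels:
      app: web-app
  spec:
    containers:
    - name: nginx
      image: nginx:1.25
      ports:
      - containerPort: 80
        name: web
      volumeMounts:
      - name: www                          # matches the volumeClaimTemplates name
        mountPath: /usr/share/nginx/html   # data here survives pod restarts
```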

4. Headless Service (service.yaml)

apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: web-app

This headless service creates stable network identities for your StatefulSet pods.

Testing and Verification Process

  1. Provision the infrastructure:
terraform init
terraform plan
terraform apply
  2. Configure kubectl:
aws eks --region ap-south-1 update-kubeconfig --name Daawar-eks-cluster
  3. Apply the manifests:
kubectl apply -f K8s-Manifests/storage-class.yaml
kubectl apply -f K8s-Manifests/persistent-volume-claim.yaml
kubectl apply -f K8s-Manifests/statefulset.yaml
kubectl apply -f K8s-Manifests/service.yaml

It is crucial to verify that the dynamically provisioned EBS volume is functioning as expected. After deploying the manifests, you should:

  1. Check PVC and PV Status:

    • Ensure the PersistentVolumeClaim (PVC) is bound to a PersistentVolume (PV):

        kubectl get pvc
        kubectl get pv
      
    • The status should be Bound.

  2. Validate Pod Storage Access:

    • Exec into one of the pods in the StatefulSet and write data to the mounted volume:

        kubectl exec -it <pod-name> -- /bin/sh
        echo "hello world" > /usr/share/nginx/html/testfile
        cat /usr/share/nginx/html/testfile
      
    • The file should be readable and persistent.

  3. Test Data Persistence:

    • Delete the pod and ensure the data remains after the pod is recreated:

        kubectl delete pod <pod-name>
        # Wait for the pod to restart, then exec again and check the file
        kubectl exec -it <new-pod-name> -- cat /usr/share/nginx/html/testfile
      
    • The file should still exist, confirming EBS persistence.

Note: These steps help ensure your EBS-backed storage is correctly provisioned and persistent across pod restarts, which is essential for stateful workloads.

Best Practices & Considerations

  • EBS volumes are AZ-specific: Pods must be scheduled in the same AZ as their volume.

  • Use Retain reclaim policy with caution: Volumes are not deleted when PVCs are removed, which can incur costs.

  • Security: Restrict SSH access in production and use least-privilege IAM.

  • Monitoring: Add monitoring and backup for production workloads.
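For non-production clusters, where orphaned volumes are a bigger concern than accidental data loss, a Delete reclaim policy avoids the cost risk called out above. A sketch reusing the same parameters as the earlier ebs-sc (the name is an assumption):

```yaml
# Dev-friendly variant: volumes are deleted along with their PVCs
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc-dev
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  fsType: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```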

Conclusion

By following this guide, you’ve learned how to automate the deployment of a production-ready EKS cluster with dynamic EBS storage using Terraform. This approach not only streamlines infrastructure management but also empowers your Kubernetes workloads with scalable, persistent storage—crucial for running databases, stateful applications, and more in the cloud.

With infrastructure as code, you gain repeatability, version control, and the ability to collaborate seamlessly. Integrating AWS EBS with EKS via the CSI driver unlocks the full power of Kubernetes for stateful workloads on AWS.

Ready to take it further? You can extend this setup with monitoring, backup, and disaster recovery, or adapt it for multi-environment deployments. If you found this guide helpful, share it with your network or leave a comment below—I’d love to hear about your experience!


Written by

Daawar Pandit

Hello, I'm Daawar Pandit, an aspiring DevOps engineer with a robust background in search quality rating and a passion for Linux systems. I am dedicated to mastering the DevOps toolkit, including Docker, Kubernetes, Jenkins, and Terraform, to streamline deployment processes and enhance software integration. Through this blog, I share insights, tips, and experiences in the DevOps field, aiming to contribute to the tech community and further my journey towards becoming a proficient DevOps professional. Join me as I delve into the dynamic world of DevOps engineering.