Fix AWS CDK Circular Reference Issues

This article discusses resolving cyclic reference issues in AWS CDK, particularly in complex infrastructures with multiple stacks like NetworkStack, DatabaseStack, and BackendStack. It focuses on overcoming security group dependency loops between these stacks, with a practical example using AWS CDK code snippets to manage security groups for an Aurora Database and ECS Fargate services. Through detailed security configurations and best practices, the solution enhances scalability, security, and maintainability in cloud deployments.

Many software companies now automate their delivery workflows to save time on rolling updates, and define their infrastructure as code with tools such as Terraform and AWS CDK.

As a DevOps Engineer, I have frequently encountered tasks requiring the automation of infrastructure and deployment processes. A few months ago, my project manager tasked me with building infrastructure on AWS using AWS CDK. Although I initially had limited experience with AWS CDK, I embarked on the challenge to create scalable and maintainable CDK code. While developing, I focused on leveraging best practices to ensure efficiency and reliability.

The main obstacle I ran into was cyclic reference issues. These are more than ordinary coding errors: they come from the way AWS CDK resolves dependencies between stacks.

What is a cyclic reference issue in AWS CDK?

A cyclic reference issue in AWS CDK (Cloud Development Kit) arises when there's a circular dependency between stacks or resources, where a resource depends on another, and the second resource depends back on the first. This creates a loop that CloudFormation, which CDK uses to deploy infrastructure, can't resolve because it can't determine the correct order of creation.
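To make the ordering problem concrete, here is a minimal sketch in plain TypeScript. It is not CDK or CloudFormation code: `DependencyGraph` and `deploymentOrder` are hypothetical names used only for illustration, and the graphs are simplified assumptions about how the three stacks might depend on each other. The function keeps deploying stacks whose dependencies are already deployed, and gives up when every remaining stack is waiting on another one.

```typescript
// Hypothetical model: each stack name maps to the stacks it depends on.
type DependencyGraph = Record<string, string[]>;

// Returns a valid deployment order, or null when no order exists
// (a cycle, or a dependency on a stack that is not in the graph).
function deploymentOrder(graph: DependencyGraph): string[] | null {
  const pending = Object.keys(graph);
  const deployed: string[] = [];
  while (pending.length > 0) {
    // A stack is ready once all of its dependencies have been deployed.
    const ready = pending.filter((stack) =>
      graph[stack].every((dep) => deployed.indexOf(dep) !== -1)
    );
    if (ready.length === 0) {
      return null; // every remaining stack is waiting on another one
    }
    for (const stack of ready) {
      deployed.push(stack);
      pending.splice(pending.indexOf(stack), 1);
    }
  }
  return deployed;
}

// Assumed wiring without a loop: the backend depends on the database.
const acyclic: DependencyGraph = {
  NetworkStack: [],
  DatabaseStack: ["NetworkStack"],
  BackendStack: ["NetworkStack", "DatabaseStack"],
};

// Assumed wiring with a loop: the database also depends on the backend.
const cyclic: DependencyGraph = {
  NetworkStack: [],
  DatabaseStack: ["NetworkStack", "BackendStack"],
  BackendStack: ["NetworkStack", "DatabaseStack"],
};

console.log(deploymentOrder(acyclic)); // [ 'NetworkStack', 'DatabaseStack', 'BackendStack' ]
console.log(deploymentOrder(cyclic)); // null
```

In the second graph NetworkStack can still be deployed, but DatabaseStack and BackendStack each wait for the other, which is the situation the cyclic reference error reports.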

In my situation, there are multiple CDK stacks that depend on each other: NetworkStack, DatabaseStack, and BackendStack. The main issue comes from the security group configuration between the DatabaseStack and BackendStack. Our database is Aurora Serverless v2, placed in a private subnet of the VPC. Our backend server is deployed on ECS. The database security group only allows traffic from ECS tasks on the same private subnet. Requests to the backend API are distributed by an Application Load Balancer and finally exposed through an API Gateway.

While setting up the security groups for each resource (Aurora Serverless V2, ECS, ALB), I ran into cyclic reference errors between the DatabaseStack and BackendStack. After a lot of effort and analysis, I found a solution to fix these cyclic reference issues. I'd like to share some code snippets.

    // Create a security group for aurora DB serverless v2
    this.databaseSecurityGroup = new SecurityGroup(this, databaseSecurityGroupName, {
        securityGroupName: databaseSecurityGroupName,
        description: "Cness DB Security Group",
        vpc: props.vpc,
        allowAllOutbound: true,
    })

    ........

    // Create the Aurora PostgreSQL Database Cluster in Private Subnets
    this.databaseCluster = new rds.DatabaseCluster(this, dbClusterIdentifier, {
      engine: rds.DatabaseClusterEngine.auroraPostgres({
        version: rds.AuroraPostgresEngineVersion.VER_16_3,
      }),
      vpc: props.vpc,
      securityGroups: [this.databaseSecurityGroup], // Attach the security group
      subnetGroup: subnetGroup,
      enableDataApi: true,
      writer: rds.ClusterInstance.serverlessV2(`${dbInstanceIdentifier}-writer`, {
        instanceIdentifier: `${dbInstanceIdentifier}-writer`,
      }),
      readers: [
        rds.ClusterInstance.serverlessV2(`${dbInstanceIdentifier}-reader`, {
          instanceIdentifier: `${dbInstanceIdentifier}-reader`,
          scaleWithWriter: true,
        }),
      ],
      serverlessV2MinCapacity: 0,
      serverlessV2MaxCapacity: 2,
      backup: {
        retention: cdk.Duration.days(7),
        preferredWindow: '07:00-07:30',
      },
      cloudwatchLogsExports: ["postgresql"],
      cloudwatchLogsRetention: logs.RetentionDays.ONE_WEEK,
      clusterIdentifier: dbClusterIdentifier,
      copyTagsToSnapshot: true,
      credentials: rds.Credentials.fromSecret(dbAdminSecret), // Use the admin credentials held in dbAdminSecret
      deletionProtection: false,
      iamAuthentication: false,
      instanceIdentifierBase: dbInstanceIdentifier,
      preferredMaintenanceWindow: "Sat:17:00-Sat:17:30",
      storageEncrypted: true, // Encrypt storage with a KMS-managed key
    });

The snippet above creates the database security group and the Aurora cluster within the DatabaseStack. Here's a breakdown of the configuration:

VPC Association and Security Group: The database is linked to a specific VPC, and a security group is attached to manage access controls.
Network Organization: A subnet group is defined to organize the network, and the Data API is enabled for easier database interaction.
Instances: The setup includes both a writer and a reader instance, using serverless V2 instances for scalable and cost-effective operations.
Capacity Management: Serverless capacity is bounded between a minimum of 0 and a maximum of 2 Aurora capacity units (ACUs) to keep resource use efficient.
Backup Configuration: Data is retained for seven days, with a designated backup window.
CloudWatch Logs: PostgreSQL logs are exported to CloudWatch, with a retention period set to one week.
Cluster Identification and Tag Management: The cluster has a unique identifier, and tags are copied to snapshots for better resource management.
Security and Maintenance: Database credentials are read from a secret rather than hard-coded. Storage encryption is enabled with a KMS-managed key, although deletion protection and IAM authentication are disabled. A weekly maintenance window is defined to minimize disruption during updates.

        const ecsSecurityGroup = new ec2.SecurityGroup(this, backendSecurityGroupName, {
            securityGroupName: backendSecurityGroupName,
            vpc: props.vpc,
            allowAllOutbound: true,
            description: 'Security group for ECS Fargate tasks',
        });

        const albSecurityGroup = new ec2.SecurityGroup(this, albSecurityGroupName, {
            securityGroupName: albSecurityGroupName,
            description: "Cness Application Load Balancer Security Group",
            vpc: props.vpc,
            allowAllOutbound: true,
        })

        albSecurityGroup.addIngressRule(
            ec2.Peer.anyIpv4(),
            ec2.Port.tcp(80),
            'Allow HTTP traffic'
        );

        .......

        const fargateService = new ecsPatterns.ApplicationLoadBalancedFargateService(this, `Backend_Service`, {
            serviceName: `Backend_Service`,
            cluster,
            taskDefinition: fargateTaskDefinition,
            publicLoadBalancer: true,
            desiredCount: 1,
            securityGroups: [ecsSecurityGroup],
            assignPublicIp: false,
            enableExecuteCommand: true,
        });

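        // Import the database security group by ID so that rules added to it
        // are created in this stack (BackendStack) instead of the DatabaseStack.
        const backendToDatabaseSecurityGroup = ec2.SecurityGroup.fromSecurityGroupId(
            this,
            'DatabaseSecurityGroup',
            props.databaseSecurityGroup.securityGroupId,
        );

        // Allow PostgreSQL traffic (port 5432) from the Fargate service's security group.
        backendToDatabaseSecurityGroup.addIngressRule(
            fargateService.service.connections.securityGroups[0],
            ec2.Port.tcp(5432),
        );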

Here is what the last part of this snippet does:

Import the database security group: backendToDatabaseSecurityGroup is created with SecurityGroup.fromSecurityGroupId, using the securityGroupId of props.databaseSecurityGroup, which was created in the DatabaseStack.
Add the ingress rule to that imported group: the rule belongs to the database security group, not to the ECS security group.
Use the security group of the ECS Fargate service as the source of the rule.
Open only port 5432 (PostgreSQL), so that only traffic from the specified ECS Fargate service can reach the database.
Keep the rule inside the BackendStack: because the group is imported within the BackendStack, the ingress rule is created there as well, so the DatabaseStack no longer has to reference anything in the BackendStack and the loop is broken.
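The effect of moving the ingress rule can be sketched with a small model in plain TypeScript. This is not the CDK API: `ModelResource`, `stackDependencies`, and `dependOnEachOther` are hypothetical names, and the "before" wiring is an assumption about what produced the loop (the rule living in the DatabaseStack while pointing at the ECS security group), since it is not shown in the snippets above. The idea is that a stack depends on every other stack whose values one of its own resources reads.

```typescript
// Hypothetical model, not the CDK API.
interface ModelResource {
  name: string;
  ownerStack: string;  // stack whose template contains the resource
  readsFrom: string[]; // stacks whose values the resource references
}

// Derives "stack -> stacks it depends on" from where each resource lives.
function stackDependencies(resources: ModelResource[]): Record<string, string[]> {
  const deps: Record<string, string[]> = {};
  for (const resource of resources) {
    const own = deps[resource.ownerStack] || (deps[resource.ownerStack] = []);
    for (const other of resource.readsFrom) {
      if (other !== resource.ownerStack && own.indexOf(other) === -1) {
        own.push(other);
      }
    }
  }
  return deps;
}

// True when stack a depends on stack b and b depends on a.
function dependOnEachOther(deps: Record<string, string[]>, a: string, b: string): boolean {
  return (deps[a] || []).indexOf(b) !== -1 && (deps[b] || []).indexOf(a) !== -1;
}

// The backend receives the database security group through its props.
const fargateService: ModelResource = {
  name: "FargateService",
  ownerStack: "BackendStack",
  readsFrom: ["DatabaseStack"],
};

// Assumed "before": the rule lives in DatabaseStack and reads the ECS security group.
const ruleInDatabaseStack: ModelResource = {
  name: "DbIngressFromEcs",
  ownerStack: "DatabaseStack",
  readsFrom: ["BackendStack"],
};

// "After": the rule lives in BackendStack and reads only the database security group ID.
const ruleInBackendStack: ModelResource = {
  name: "DbIngressFromEcs",
  ownerStack: "BackendStack",
  readsFrom: ["DatabaseStack"],
};

const depsBefore = stackDependencies([fargateService, ruleInDatabaseStack]);
const depsAfter = stackDependencies([fargateService, ruleInBackendStack]);

console.log(dependOnEachOther(depsBefore, "DatabaseStack", "BackendStack")); // true
console.log(dependOnEachOther(depsAfter, "DatabaseStack", "BackendStack")); // false
```

In this model the rule itself is unchanged; only its owning stack moves. All references then point from the BackendStack to the DatabaseStack, so the two stacks can be deployed in order.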

After addressing the cyclic reference issues in AWS CDK, we have significantly improved the stability and maintainability of our infrastructure. By carefully managing dependencies and ensuring that each component is properly isolated, we have reduced the risk of deployment failures and improved the overall efficiency of our system.

In conclusion, resolving these issues not only enhances our application's reliability but also streamlines our development process. This allows our team to focus on delivering new features and improvements without being bogged down by infrastructure complexities.

Thank you for your attention to these crucial details. Let's continue to build robust and scalable solutions.

Best regards,

Lynn

Lynn Nguyen