Project Aether: Building an AI-Native IaC Tool From a Secure Foundation

This series documents the journey of building "Aether," an offline-first, AI-native CLI tool that generates secure Infrastructure as Code (IaC) from natural language prompts. This week, we moved past basic setup and into the core challenge: creating a high-quality, secure dataset to train our AI model.
Phase 1 – The AI Dataset (Progress & Challenges)
Objective
The goal is to create a dataset of several hundred high-quality, "instruction-response" pairs. Each pair consists of a natural language prompt (e.g., "Create a secure S3 bucket") and a "perfect" CloudFormation YAML response. The quality of our final AI model depends entirely on the quality of this dataset.
Our core challenge was ensuring every YAML example is not just functional, but also secure and compliant with best practices.
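To make the pair format concrete, here is a hedged sketch of what one entry could look like when stored as JSONL. The field names `instruction` and `response` are illustrative assumptions, not the project's actual schema:

```python
import json

# Hypothetical schema: field names are illustrative, not Aether's real format.
entry = {
    "instruction": "Create a secure S3 bucket",
    "response": (
        "AWSTemplateFormatVersion: '2010-09-09'\n"
        "Resources:\n"
        "  SecureBucket:\n"
        "    Type: AWS::S3::Bucket\n"
    ),
}

# JSONL stores one JSON object per line, which makes the dataset easy to stream.
line = json.dumps(entry)
print(line)
```

Because each example is a single line, appending new validated pairs to the dataset is a simple file append.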
Problem 1: Ensuring Security at Scale
Manually verifying the security of hundreds of YAML files is impractical and error-prone. We needed an automated way to enforce security best practices for every single example in our dataset.
Solution:
We integrated checkov, an open-source static analysis tool for IaC. The new rule for our workflow became: no YAML response is considered "perfect" until it passes the checkov scan with zero failures.
Problem 2: The Iterative Validation Cycle
Our first test was creating a secure S3 bucket for logging. This simple task immediately highlighted the importance of automated validation.
Initial Attempt:
A basic YAML file for an S3 bucket.
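A sketch of what that first attempt might have looked like (illustrative only; the resource name matches the one in the error below, but the properties are assumptions):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  LoggingBucket:
    Type: AWS::S3::Bucket
    Properties:
      # Private, but with no access logging configured
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
```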
Error Message:
Bash
checkov -f test.yml
Check: CKV_AWS_18: "Ensure the S3 bucket has access logging enabled"
FAILED for resource: AWS::S3::Bucket.LoggingBucket
Cause:
The bucket, while private, was missing server access logging, a critical security feature for auditing.
Fix #1:
We modified the template to include a second bucket (LogDestinationBucket) and enabled logging from the primary bucket to the new one.
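In CloudFormation, that change looks roughly like this (a sketch; the log prefix and other values are illustrative):

```yaml
Resources:
  LoggingBucket:
    Type: AWS::S3::Bucket
    Properties:
      LoggingConfiguration:
        DestinationBucketName: !Ref LogDestinationBucket
        LogFilePrefix: access-logs/
  LogDestinationBucket:
    Type: AWS::S3::Bucket
```

`!Ref` on an S3 bucket resolves to its bucket name, which is exactly what `DestinationBucketName` expects.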
New Error:
Bash
checkov -f test.yml
Check: CKV_AWS_21: "Ensure the S3 bucket has versioning enabled"
FAILED for resource: AWS::S3::Bucket.LogDestinationBucket
Cause:
We had secured our primary bucket, but the new logging bucket we created was itself insecure. It was missing versioning, encryption, and public access blocks. This is a classic IaC pitfall.
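Hardening the logging bucket means giving it the same protections as the primary one. Roughly (a sketch; encryption settings and other values are illustrative):

```yaml
LogDestinationBucket:
  Type: AWS::S3::Bucket
  Properties:
    VersioningConfiguration:
      Status: Enabled
    BucketEncryption:
      ServerSideEncryptionConfiguration:
        - ServerSideEncryptionByDefault:
            SSEAlgorithm: AES256
    PublicAccessBlockConfiguration:
      BlockPublicAcls: true
      BlockPublicPolicy: true
      IgnorePublicAcls: true
      RestrictPublicBuckets: true
```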
Problem 3: The "Who Logs the Logger?" Dilemma
After securing the new logging bucket with all the same best practices, we faced one final, stubborn error.
Error Message:
Bash
checkov -f test.yml
Check: CKV_AWS_18: "Ensure the S3 bucket has access logging enabled"
FAILED for resource: AWS::S3::Bucket.LogDestinationBucket
Why it happened:
checkov correctly pointed out that our log destination bucket also needed logging enabled. While technically true, this would require a third bucket to store the logs for the log bucket, leading to unnecessary complexity for our dataset.
Fix (The "Perfect" Response):
This is a valid exception to the rule. We suppressed the check for this specific resource by adding a Metadata block to the CloudFormation template. This tells checkov that we have intentionally reviewed and accepted this exception.
YAML
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  LogDestinationBucket:
    Type: AWS::S3::Bucket
    Metadata: # <-- The final fix to suppress the check
      checkov:
        skip:
          - id: 'CKV_AWS_18'
            comment: 'This is the destination bucket for logs'
    DeletionPolicy: Retain
    Properties:
      # ... other security properties
This iterative process resulted in a truly secure and validated template, ready for our dataset.
Automating the Workflow
Manually running checkov on every file is too slow. We wrote a simple Python script to automate the validation of the entire dataset file at once.
The Script (validate_dataset.py):
Python
import json
import subprocess
import tempfile
import os
from rich.console import Console

console = Console()
DATASET_FILE = "data.jsonl"  # Updated filename

def validate_dataset():
    """
    Reads a JSONL dataset file, validates the YAML output of each entry
    using checkov, and reports the results.
    """
    if not os.path.exists(DATASET_FILE):
        console.print(f"[bold red]Error:[/bold red] Dataset file '{DATASET_FILE}' not found.")
        return
    # ... (rest of the script) ...
This script loops through each entry, saves the YAML to a temporary file, runs checkov, and prints a simple "PASSED" or "FAILED" report, showing the errors for any failures. This turns hours of work into seconds.
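The core loop described above can be sketched as follows. This is a hedged illustration of the elided portion of the script, not the actual implementation; the `response` field name is an assumption, and the scan function is injectable so the loop's logic can be exercised without checkov installed:

```python
import os
import subprocess
import tempfile

def scan_with_checkov(path):
    """Run checkov on a single file; True means the scan passed."""
    result = subprocess.run(
        ["checkov", "-f", path], capture_output=True, text=True
    )
    return result.returncode == 0

def validate_entries(entries, scan=scan_with_checkov):
    """Write each entry's YAML to a temp file, scan it, collect results."""
    results = []
    for i, entry in enumerate(entries):
        # delete=False so the closed file can be re-read by the scanner
        with tempfile.NamedTemporaryFile(
            mode="w", suffix=".yml", delete=False
        ) as tmp:
            tmp.write(entry["response"])  # "response" field is an assumption
            path = tmp.name
        try:
            passed = scan(path)
        finally:
            os.unlink(path)  # always clean up the temp file
        results.append((i, "PASSED" if passed else "FAILED"))
    return results
```

Injecting the scanner also makes it easy to swap checkov for another tool later without touching the loop.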
Tools Used
Python 3.10
Checkov (for automated security scanning)
CloudFormation (YAML)
Visual Studio Code / nano
What's Working So Far
A clear project architecture for a self-contained, AI-native CLI tool.
A robust, automated workflow for creating and validating secure IaC dataset examples.
A seed dataset with several high-quality, validated entries.
Want to Follow Along?
I’ll be sharing weekly progress — issues, logs, architecture, and the AI model itself. If you've solved similar problems (like automated cloud optimization or building AI developer tools), I’d love to hear your insights.
Written by Sahil Gada