Making an AWS Lambda Function Testable Locally: CSV to RDS in Python

While building a serverless pipeline to transform .csv files and load them into a PostgreSQL RDS database on AWS, I reached a point where I wanted to write a local test for the Lambda function that handles the transformation.

The original Lambda handler was tightly coupled with AWS services like S3 and Secrets Manager. To make the code testable, I decided to refactor the logic into smaller, isolated functions. Here's a breakdown of how I approached it.

Original Lambda Handler: One Big Function

Initially, my handler.py file did everything:

  • pulled a file from S3

  • fetched secrets from AWS Secrets Manager

  • connected to RDS

  • parsed the CSV

  • inserted records into a database

This made it hard to test locally, since you needed a full AWS environment for every part to run.
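Condensed into a sketch (with assumed names and a typical S3-event shape, not the author's exact code), the coupled version looked roughly like this. Every step needs live AWS services or a reachable RDS instance, so none of it can run on a laptop in isolation:

```python
import csv
import json
from io import StringIO


def lambda_handler(event, context):
    import boto3     # AWS SDK, available in the Lambda runtime
    import psycopg2  # DB driver, bundled or shipped as a Lambda layer

    # 1. pull the uploaded file from S3
    record = event["Records"][0]["s3"]
    s3 = boto3.client("s3")
    body = s3.get_object(Bucket=record["bucket"]["name"],
                         Key=record["object"]["key"])["Body"].read()

    # 2. fetch DB credentials from Secrets Manager
    sm = boto3.client("secretsmanager")
    secret = json.loads(sm.get_secret_value(SecretId="budget-db")["SecretString"])

    # 3. connect to RDS
    conn = psycopg2.connect(host=secret["host"], dbname=secret["dbname"],
                            user=secret["username"], password=secret["password"])

    # 4./5. parse the CSV and insert records
    for row in csv.reader(StringIO(body.decode("utf-8")), delimiter=";"):
        ...  # per-row INSERT, then commit and cleanup
    conn.close()
```

Nothing here can be exercised without mocking S3, Secrets Manager, and the database all at once.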

✅ Step 1: Separate Business Logic from the AWS Glue Code

I identified the core logic that could be tested independently:
👉 parsing and transforming a CSV file and inserting it into a database.

I extracted this logic into a function called process_csv_file(csv_content, db_config).

import csv
from io import StringIO

import psycopg2

def process_csv_file(csv_content, db_config):
    print("🚀 Connecting to database...")
    conn = psycopg2.connect(**db_config)
    cursor = conn.cursor()

    csv_reader = csv.reader(StringIO(csv_content), delimiter=";")
    headers = next(csv_reader)
    print(f"🧾 CSV headers: {headers}")

    insert_sql = """
        INSERT INTO transactions (
            transaction_date, booking_date, reject_date,
            amount, currency, sender_receiver, description,
            product, transaction_type, order_amount, order_currency,
            status, balance_after
        ) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
    """

    row_count = 0
    for row in csv_reader:
        parsed_row = parse_row(row)  # helper defined elsewhere in handler.py
        try:
            # a savepoint lets one bad row be skipped without
            # aborting the whole PostgreSQL transaction
            cursor.execute("SAVEPOINT row_insert")
            cursor.execute(insert_sql, parsed_row)
            cursor.execute("RELEASE SAVEPOINT row_insert")
            row_count += 1
            if row_count % 50 == 0:
                print(f"📊 Inserted {row_count} rows so far...")
        except Exception as e:
            cursor.execute("ROLLBACK TO SAVEPOINT row_insert")
            print(f"❌ Failed to insert row {row}: {e}")

    conn.commit()
    cursor.close()
    conn.close()
    print(f"✅ Finished. Inserted {row_count} rows into RDS.")

The Lambda entry point in handler.py still holds all the AWS-related code (S3 downloads, Secrets Manager lookups), which is exactly the part I don't want to exercise in local tests.
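For illustration, a slimmed-down entry point might wire the AWS pieces to process_csv_file like this. The secret name, the secret's JSON shape, and the S3-event handling are assumptions, not the author's exact code:

```python
import json


def get_db_config(secret_name):
    import boto3  # imported lazily so the module also loads without the AWS SDK
    client = boto3.client("secretsmanager")
    secret = json.loads(
        client.get_secret_value(SecretId=secret_name)["SecretString"]
    )
    return {
        "host": secret["host"],
        "port": secret["port"],
        "dbname": secret["dbname"],
        "user": secret["username"],
        "password": secret["password"],
    }


def lambda_handler(event, context):
    import boto3
    record = event["Records"][0]["s3"]
    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket=record["bucket"]["name"],
                        Key=record["object"]["key"])
    csv_content = obj["Body"].read().decode("utf-8")
    # process_csv_file is the extracted, locally testable function from above
    process_csv_file(csv_content, get_db_config("budget-db-credentials"))
```

All the AWS coupling now lives in two small functions, and the business logic can be driven locally without either of them.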

✅ Step 2: Write a Local Test Runner

Next, I created a simple Python script that runs this function locally against a test .csv file and a local PostgreSQL instance:

# local_test.py

import os
from handler import process_csv_file

# Replace with your local DB credentials
db_config = {
    "host": "localhost",
    "port": "5432",
    "dbname": "budget",
    "user": "budgetadmin",
    "password": "JvJWGgkmT5BnDj4El67H"
}

def local_main():
    with open("test.csv", encoding="utf-8") as f:
        csv_content = f.read()
        process_csv_file(csv_content, db_config)

if __name__ == "__main__":
    local_main()

This allowed me to quickly test changes to CSV parsing or SQL logic without redeploying the Lambda.

I could still push this setup further toward a true unit test. As it stands, it is closer to an integration test, since it talks to a real local database. Even so, I'm pleased with the result.
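One way to turn this into a unit test is to inject the connection instead of building it inside the function; a unittest.mock.MagicMock can then stand in for the database. The sketch below uses a simplified process_csv_file to show the idea, not the exact version above:

```python
# Sketch: dependency-injecting the connection makes the function
# unit-testable with a stub instead of a real PostgreSQL instance.
import csv
from io import StringIO
from unittest.mock import MagicMock


def process_csv_file(csv_content, conn):
    cursor = conn.cursor()
    reader = csv.reader(StringIO(csv_content), delimiter=";")
    next(reader)  # skip the header row
    rows = 0
    for row in reader:
        cursor.execute("INSERT INTO transactions VALUES (%s, %s)", row)
        rows += 1
    conn.commit()
    return rows


fake_conn = MagicMock()
inserted = process_csv_file("a;b\n1;2\n3;4\n", fake_conn)
print(inserted)  # 2 — both data rows were "inserted" into the stub
```

A test can then assert on what the stub received, e.g. via fake_conn.cursor.return_value.execute.call_args_list, with no database involved at all.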


Written by
Jakub Sokolowski