Making an AWS Lambda Function Testable Locally: CSV to RDS in Python


While building a serverless pipeline to transform .csv files and load them into a PostgreSQL RDS database on AWS, I reached a point where I wanted to write a local test for the Lambda function that handles the transformation.
The original Lambda handler was tightly coupled with AWS services like S3 and Secrets Manager. To make the code testable, I decided to refactor the logic into smaller, isolated functions. Here's a breakdown of how I approached it.
Original Lambda Handler: One Big Function
Initially, my handler.py file did everything:

- pulled a file from S3
- fetched secrets from AWS Secrets Manager
- connected to RDS
- parsed the CSV
- inserted records into a database
This made it hard to test locally, since you needed a full AWS environment for every part to run.
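Condensed, the original handler's shape looked roughly like this (an illustrative skeleton, not the actual code):

```python
def lambda_handler(event, context):
    # Everything lives in one function, so nothing runs without AWS:
    # 1. download the .csv from S3 (boto3)
    # 2. read DB credentials from Secrets Manager (boto3)
    # 3. connect to RDS (psycopg2)
    # 4. parse the CSV rows
    # 5. insert each record into the database
    ...
```

Every step depends on the one before it, so there is no seam where a test could slot in a fake.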
✅ Step 1: Separate Business Logic from AWS Glue
I identified the core logic that could be tested independently: parsing and transforming a CSV file and inserting it into a database. I extracted this logic into a function called process_csv_file(csv_content, db_config).
```python
import csv
from io import StringIO

import psycopg2


def process_csv_file(csv_content, db_config):
    print("🔌 Connecting to database...")
    conn = psycopg2.connect(**db_config)
    cursor = conn.cursor()

    csv_reader = csv.reader(StringIO(csv_content), delimiter=";")
    headers = next(csv_reader)
    print(f"🧾 CSV headers: {headers}")

    insert_sql = """
        INSERT INTO transactions (
            transaction_date, booking_date, reject_date,
            amount, currency, sender_receiver, description,
            product, transaction_type, order_amount, order_currency,
            status, balance_after
        ) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
    """

    row_count = 0
    for row in csv_reader:
        parsed_row = parse_row(row)  # parse_row is defined elsewhere in handler.py
        try:
            # Savepoint per row: without it, one failed INSERT aborts the
            # whole PostgreSQL transaction and every later insert fails too.
            cursor.execute("SAVEPOINT row_sp")
            cursor.execute(insert_sql, parsed_row)
            row_count += 1
            if row_count % 50 == 0:
                print(f"📥 Inserted {row_count} rows so far...")
        except Exception as e:
            cursor.execute("ROLLBACK TO SAVEPOINT row_sp")
            print(f"❌ Failed to insert row {row}: {e}")

    conn.commit()
    cursor.close()
    conn.close()
    print(f"✅ Finished. Inserted {row_count} rows into RDS.")
```
All the AWS-specific code (downloading the file from S3, fetching credentials from Secrets Manager) stays in the lambda_handler entry point in handler.py, which is exactly the part I don't want to test locally.
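One way to make that seam explicit is to pass the AWS-facing collaborators into the handler as parameters. This is a sketch with illustrative names, not the article's exact handler; in the deployed version each parameter would default to a real boto3-backed function, while a local test injects fakes:

```python
def lambda_handler(event, context, fetch_csv, fetch_db_config, load_csv):
    """Thin entry point: AWS wiring is injected, logic lives elsewhere."""
    csv_content = fetch_csv(event)        # real impl: boto3 S3 get_object
    db_config = fetch_db_config()         # real impl: Secrets Manager lookup
    return load_csv(csv_content, db_config)

# Local check: fakes stand in for every AWS dependency.
calls = []
result = lambda_handler(
    event={"bucket": "b", "key": "k"},
    context=None,
    fetch_csv=lambda event: "h1;h2\n1;2\n",
    fetch_db_config=lambda: {"host": "localhost"},
    load_csv=lambda content, cfg: calls.append((content, cfg)) or "loaded",
)
assert result == "loaded"
assert calls == [("h1;h2\n1;2\n", {"host": "localhost"})]
```

With this shape, no test ever has to import boto3 at all.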
✅ Step 2: Write a Local Test Runner
Next, I created a simple Python script to run this function locally against a test .csv file and a local PostgreSQL instance:
```python
# local_test.py
from handler import process_csv_file

# Replace with your local DB credentials
db_config = {
    "host": "localhost",
    "port": "5432",
    "dbname": "budget",
    "user": "budgetadmin",
    "password": "JvJWGgkmT5BnDj4El67H",
}


def local_main():
    with open("test.csv", encoding="utf-8") as f:
        csv_content = f.read()
    process_csv_file(csv_content, db_config)


if __name__ == "__main__":
    local_main()
```
This allowed me to quickly test changes to CSV parsing or SQL logic without redeploying the Lambda.
I could still extend this test to make it more like a unit test. Currently, it is more of an integration test since I am using a local database. However, I am pleased with the results I have achieved.
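One way to push it closer to a unit test would be to exercise only the parsing step against an in-memory CSV, with no database at all. A minimal sketch (parse_rows and the sample data are illustrative, not taken from the actual handler):

```python
import csv
from io import StringIO


def parse_rows(csv_content):
    """Parse semicolon-delimited CSV content into row tuples,
    skipping the header line (mirrors the loop in process_csv_file)."""
    reader = csv.reader(StringIO(csv_content), delimiter=";")
    next(reader)  # skip headers
    return [tuple(row) for row in reader]


# In-memory fixture: no file on disk, no database connection needed.
sample = "date;amount;currency\n2024-01-02;12.50;EUR\n2024-01-03;-3.10;EUR\n"
rows = parse_rows(sample)
assert rows == [("2024-01-02", "12.50", "EUR"), ("2024-01-03", "-3.10", "EUR")]
```

Splitting the parsing out like this would let the database-free part run in CI without any PostgreSQL instance.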
Written by Jakub Sokolowski