Automated Receipt Organiser 🧾 with AWS ☁️ and AI 🤖

About the Project
This project uses AWS tools to organise a user's receipts and deliver their details by email. It saves time and keeps receipts organised, so there is no need to hunt for physical copies. 💡
Services Used
Amazon S3: Stores uploaded receipt images and PDFs. [Storage]
Amazon Textract: Extracts text and structured data from scanned receipts. [AI/ML]
Amazon DynamoDB: Stores extracted receipt data in a structured format. [Database]
Amazon SES: Sends email notifications with extracted receipt details. [Messaging]
AWS Lambda: Automates the processing workflow for real-time execution. [Compute]
IAM Roles & Policies: Ensures secure access between services. [Security]
Architectural Diagram 📝
Automated AWS Receipt Processing System
The architecture consists of five layers:
Storage Layer: Amazon S3 stores receipt images and PDFs.
Processing Layer: Amazon Textract extracts text from receipts using AI-powered OCR.
Database Layer: DynamoDB stores the extracted data in a structured format.
Notification System: Amazon SES sends email alerts with receipt details.
Compute Layer: AWS Lambda automates the workflow by processing the receipts in real-time.
Time ⌛ and Cost 💸
Approximately 2 hours, using AWS Free Tier services
Step-by-Step Guide
Sign in to AWS Console
Navigate to Amazon S3 and click on Create Bucket.
Ensure the bucket name is globally unique, and keep the other settings at their defaults.
Click on the newly created S3 bucket and create a folder named "incoming" for uploading receipts.
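If you prefer scripting this step, it can be sketched with boto3 (the bucket name below is a placeholder). Note that S3 has no real folders: the console's "folder" is just a zero-byte object whose key ends in a slash.

```python
def folder_placeholder_key(folder="incoming"):
    # An S3 "folder" is simply a zero-byte object whose key ends in "/"
    return folder.rstrip("/") + "/"

# Sketch only -- requires boto3 and AWS credentials; bucket name is a placeholder:
# import boto3
# s3 = boto3.client("s3")
# s3.create_bucket(Bucket="my-unique-receipts-bucket")  # us-east-1; other regions
#                                                       # need CreateBucketConfiguration
# s3.put_object(Bucket="my-unique-receipts-bucket", Key=folder_placeholder_key())
```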
Set Up DynamoDB
Navigate to DynamoDB and click on Create Table.
Enter "receipts" as the Table Name.
Set "receipt_id" as the partition key with type "string".
Set "date" as the sort key with type "string" to sort receipts by date.
Keep other settings default and click Create Table.
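The same table can be created programmatically. This sketch builds the `create_table` arguments matching the console steps above; the billing mode is an assumption (on-demand), so switch it if you prefer provisioned capacity.

```python
def receipts_table_params(table_name="receipts"):
    # Mirrors the console steps: receipt_id partition key, date sort key, both strings
    return {
        "TableName": table_name,
        "KeySchema": [
            {"AttributeName": "receipt_id", "KeyType": "HASH"},   # partition key
            {"AttributeName": "date", "KeyType": "RANGE"},        # sort key
        ],
        "AttributeDefinitions": [
            {"AttributeName": "receipt_id", "AttributeType": "S"},
            {"AttributeName": "date", "AttributeType": "S"},
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }

# Sketch only -- requires boto3 and AWS credentials:
# import boto3
# boto3.client("dynamodb").create_table(**receipts_table_params())
```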
Configure Amazon SES
Navigate to Amazon SES and go to Configuration > Identities.
Click on Create Identity and choose the Email Address option.
Enter the email address to be used for sending receipts and click Create Identity.
Verify your email address through the verification email sent by AWS.
Create IAM Role
Navigate to IAM and click on Roles.
Create a role using AWS Service as the trusted entity type and Lambda as the use case.
On the permissions policies page, select the following policies:
AmazonS3ReadOnlyAccess
AmazonTextractFullAccess
AmazonDynamoDBFullAccess
AmazonSESFullAccess
AWSLambdaBasicExecutionRole
Name the role "ReceiptProcessingLambdaRole" and click Create Role.
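For reference, the console steps above amount to creating a role that Lambda can assume and attaching the five managed policies. The trust policy and ARNs below are the standard AWS-managed ones for the policies named above; the boto3 calls are a sketch only.

```python
# The trust policy lets the Lambda service assume the role
LAMBDA_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# AWS-managed policies selected in the console steps above
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
    "arn:aws:iam::aws:policy/AmazonTextractFullAccess",
    "arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess",
    "arn:aws:iam::aws:policy/AmazonSESFullAccess",
    "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole",
]

# Sketch only -- requires boto3 and AWS credentials:
# import json, boto3
# iam = boto3.client("iam")
# iam.create_role(RoleName="ReceiptProcessingLambdaRole",
#                 AssumeRolePolicyDocument=json.dumps(LAMBDA_TRUST_POLICY))
# for arn in MANAGED_POLICY_ARNS:
#     iam.attach_role_policy(RoleName="ReceiptProcessingLambdaRole", PolicyArn=arn)
```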
Create Lambda Function
Navigate to AWS Lambda, click on Create Function, and select Author from scratch.
Name the function "ReceiptProcessor" (or any preferred name; later steps assume "ReceiptProcessor").
Choose Python 3.9 as the runtime.
Select the existing role "ReceiptProcessingLambdaRole".
Change the timeout to 3 minutes in the configuration settings.
Add the following environment variables:
Key | Value
DYNAMODB_TABLE | receipts
SES_SENDER_EMAIL | the email address you verified in SES
SES_RECIPIENT_EMAIL | the same email as the sender
DynamoDB table names are case-sensitive, so the DYNAMODB_TABLE value must exactly match the name of the table you created.
Add Lambda Code
Replace the default code with the provided code for processing receipts.
Code👩💻
import json
import os
import boto3
import uuid
from datetime import datetime
import urllib.parse

# Initialize AWS clients
s3 = boto3.client('s3')
textract = boto3.client('textract')
dynamodb = boto3.resource('dynamodb')
ses = boto3.client('ses')

# Environment variables
DYNAMODB_TABLE = os.environ.get('DYNAMODB_TABLE', 'receipts')
SES_SENDER_EMAIL = os.environ.get('SES_SENDER_EMAIL', 'your-email@example.com')
SES_RECIPIENT_EMAIL = os.environ.get('SES_RECIPIENT_EMAIL', 'recipient@example.com')

def lambda_handler(event, context):
    try:
        # Get the S3 bucket and key from the event
        bucket = event['Records'][0]['s3']['bucket']['name']
        # URL-decode the key to handle spaces and special characters
        key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])

        print(f"Processing receipt from {bucket}/{key}")

        # Verify the object exists before proceeding
        try:
            s3.head_object(Bucket=bucket, Key=key)
            print(f"Object verification successful: {bucket}/{key}")
        except Exception as e:
            print(f"Object verification failed: {str(e)}")
            raise Exception(f"Unable to access object {key} in bucket {bucket}: {str(e)}")

        # Step 1: Process receipt with Textract
        receipt_data = process_receipt_with_textract(bucket, key)

        # Step 2: Store results in DynamoDB
        store_receipt_in_dynamodb(receipt_data, bucket, key)

        # Step 3: Send email notification
        send_email_notification(receipt_data)

        return {
            'statusCode': 200,
            'body': json.dumps('Receipt processed successfully!')
        }
    except Exception as e:
        print(f"Error processing receipt: {str(e)}")
        return {
            'statusCode': 500,
            'body': json.dumps(f'Error: {str(e)}')
        }

def process_receipt_with_textract(bucket, key):
    """Process receipt using Textract's AnalyzeExpense operation"""
    try:
        print(f"Calling Textract analyze_expense for {bucket}/{key}")
        response = textract.analyze_expense(
            Document={
                'S3Object': {
                    'Bucket': bucket,
                    'Name': key
                }
            }
        )
        print("Textract analyze_expense call successful")
    except Exception as e:
        print(f"Textract analyze_expense call failed: {str(e)}")
        raise

    # Generate a unique ID for this receipt
    receipt_id = str(uuid.uuid4())

    # Initialize receipt data dictionary with defaults
    receipt_data = {
        'receipt_id': receipt_id,
        'date': datetime.now().strftime('%Y-%m-%d'),  # Default date
        'vendor': 'Unknown',
        'total': '0.00',
        'items': [],
        's3_path': f"s3://{bucket}/{key}"
    }

    # Extract data from Textract response
    if 'ExpenseDocuments' in response and response['ExpenseDocuments']:
        expense_doc = response['ExpenseDocuments'][0]

        # Process summary fields (TOTAL, DATE, VENDOR)
        if 'SummaryFields' in expense_doc:
            for field in expense_doc['SummaryFields']:
                field_type = field.get('Type', {}).get('Text', '')
                value = field.get('ValueDetection', {}).get('Text', '')
                if field_type == 'TOTAL':
                    receipt_data['total'] = value
                elif field_type == 'INVOICE_RECEIPT_DATE':
                    # Textract returns the date as free text; keep the default if empty
                    if value:
                        receipt_data['date'] = value
                elif field_type == 'VENDOR_NAME':
                    receipt_data['vendor'] = value

        # Process line items
        if 'LineItemGroups' in expense_doc:
            for group in expense_doc['LineItemGroups']:
                for line_item in group.get('LineItems', []):
                    item = {}
                    for field in line_item.get('LineItemExpenseFields', []):
                        field_type = field.get('Type', {}).get('Text', '')
                        value = field.get('ValueDetection', {}).get('Text', '')
                        if field_type == 'ITEM':
                            item['name'] = value
                        elif field_type == 'PRICE':
                            item['price'] = value
                        elif field_type == 'QUANTITY':
                            item['quantity'] = value
                    # Add to items list only if we have a name
                    if 'name' in item:
                        receipt_data['items'].append(item)

    print(f"Extracted receipt data: {json.dumps(receipt_data)}")
    return receipt_data

def store_receipt_in_dynamodb(receipt_data, bucket, key):
    """Store the extracted receipt data in DynamoDB"""
    try:
        table = dynamodb.Table(DYNAMODB_TABLE)

        # Convert items to a format DynamoDB can store
        items_for_db = []
        for item in receipt_data['items']:
            items_for_db.append({
                'name': item.get('name', 'Unknown Item'),
                'price': item.get('price', '0.00'),
                'quantity': item.get('quantity', '1')
            })

        # Create the item to insert
        db_item = {
            'receipt_id': receipt_data['receipt_id'],
            'date': receipt_data['date'],
            'vendor': receipt_data['vendor'],
            'total': receipt_data['total'],
            'items': items_for_db,
            's3_path': receipt_data['s3_path'],
            'processed_timestamp': datetime.now().isoformat()
        }

        # Insert into DynamoDB
        table.put_item(Item=db_item)
        print(f"Receipt data stored in DynamoDB: {receipt_data['receipt_id']}")
    except Exception as e:
        print(f"Error storing data in DynamoDB: {str(e)}")
        raise

def send_email_notification(receipt_data):
    """Send an email notification with receipt details"""
    try:
        # Format items for the email body
        items_html = ""
        for item in receipt_data['items']:
            name = item.get('name', 'Unknown Item')
            price = item.get('price', 'N/A')
            quantity = item.get('quantity', '1')
            items_html += f"<li>{name} - ${price} x {quantity}</li>"
        if not items_html:
            items_html = "<li>No items detected</li>"

        # Create the email body
        html_body = f"""
        <html>
        <body>
            <h2>Receipt Processing Notification</h2>
            <p><strong>Receipt ID:</strong> {receipt_data['receipt_id']}</p>
            <p><strong>Vendor:</strong> {receipt_data['vendor']}</p>
            <p><strong>Date:</strong> {receipt_data['date']}</p>
            <p><strong>Total Amount:</strong> ${receipt_data['total']}</p>
            <p><strong>S3 Location:</strong> {receipt_data['s3_path']}</p>
            <h3>Items:</h3>
            <ul>
            {items_html}
            </ul>
            <p>The receipt has been processed and stored in DynamoDB.</p>
        </body>
        </html>
        """

        # Send the email using SES
        ses.send_email(
            Source=SES_SENDER_EMAIL,
            Destination={
                'ToAddresses': [SES_RECIPIENT_EMAIL]
            },
            Message={
                'Subject': {
                    'Data': f"Receipt Processed: {receipt_data['vendor']} - ${receipt_data['total']}"
                },
                'Body': {
                    'Html': {
                        'Data': html_body
                    }
                }
            }
        )
        print(f"Email notification sent to {SES_RECIPIENT_EMAIL}")
    except Exception as e:
        print(f"Error sending email notification: {str(e)}")
        # Continue execution even if email fails
        print("Continuing execution despite email error")
Set Up S3 Event Notification
Go back to the S3 bucket and navigate to the Properties tab.
Scroll to Event Notifications and click Create Event Notification.
Enter "ReceiptUpload" as the name and "incoming/" as the prefix.
Select All object create events as the event type.
Choose Lambda function as the destination and select ReceiptProcessor from the dropdown.
Save the changes.
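The same notification can be expressed as a boto3 notification configuration. The Lambda ARN below is a placeholder; also note that, unlike the console, the API route requires you to separately grant S3 permission to invoke the function (via Lambda's add_permission).

```python
def receipt_upload_notification(lambda_arn, prefix="incoming/"):
    # Mirrors the console's "ReceiptUpload" notification: all object-create
    # events under the incoming/ prefix trigger the Lambda function
    return {
        "LambdaFunctionConfigurations": [{
            "Id": "ReceiptUpload",
            "LambdaFunctionArn": lambda_arn,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}},
        }]
    }

# Sketch only -- requires boto3, AWS credentials, and a real function ARN:
# import boto3
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="my-unique-receipts-bucket",
#     NotificationConfiguration=receipt_upload_notification(
#         "arn:aws:lambda:REGION:ACCOUNT_ID:function:ReceiptProcessor"))
```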
- Testing
Upload a receipt to the incoming folder in the S3 bucket. Here is an example you can use ⬇️
Check your email for a notification (it might be in the spam folder).
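You can also sanity-check the handler's key-decoding logic locally with a hand-built event (the bucket and file names here are made up). This mirrors the first lines of the Lambda handler:

```python
import urllib.parse

def extract_bucket_and_key(event):
    # Same extraction the Lambda handler performs on the S3 trigger event
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"])
    return bucket, key

# S3 encodes spaces in object keys as "+" in the event payload
sample_event = {"Records": [{"s3": {
    "bucket": {"name": "my-unique-receipts-bucket"},
    "object": {"key": "incoming/coffee+receipt.jpg"},
}}]}
print(extract_bucket_and_key(sample_event))
# -> ('my-unique-receipts-bucket', 'incoming/coffee receipt.jpg')
```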
By following these steps, you will have a fully functional automated receipt processing system using AWS services.
The email should look something like this ⬇️
You can also visit the Lambda function and click on Monitor to see whether the function ran; the dots in the graph below represent the activity.
Cleanup 🗑️
1. Delete S3 Bucket: Remove all uploaded receipt files and then delete the bucket.
2. Stop Textract Processing: Ensure no further API calls are made to prevent extra costs.
3. Delete DynamoDB Table: Remove stored receipt data and then delete the table.
4. Disable SES Notifications: If SES was configured, remove verified email addresses.
5. Remove IAM Roles and Policies: Delete the IAM role created for the Lambda function.
✨Inspiration✨
This article is inspired by Tech With Lucy's Build With Me videos on YouTube.
Feel free to leave me a comment if you face any issues during the project. 🥰
Written by Savi