💸 Building a Personal Expense Tracker with Python and SQLite

Managing personal finances can quickly spiral into a cluttered mess of CSV exports and disorganized spreadsheets. I wanted a workflow that was repeatable, scriptable, and easy to visualize. So I built a simple CLI in Python to help categorize and analyze my expenses using SQLite and Superset.
This post walks through how the system is designed, how to use the CLI to tag expenses, and how the data flows into a unified view for analysis.
📂 System Overview
All financial data is imported and processed into a local SQLite database. The schema is intentionally simple and centered around these components:
📋 Tables and Views
cleaned_ally view
A deduplicated and formatted version of Ally transaction exports. The view casts Amount as a decimal and removes leading spaces from CSV headers:
CREATE VIEW cleaned_ally AS
SELECT
Date,
CASE
WHEN tp = 'Withdrawal' THEN 'Spend'
ELSE 'Recieve'
END AS Type,
" Description" AS Description,
CAST(" Amount" AS decimal) AS Amount,
NULL AS Category
FROM (
SELECT DISTINCT
Date,
" Type" AS tp,
" Description",
" Amount"
FROM ally
);
cleaned_amex view
This view parses dates, handles withdrawal/deposit classification, and is ready for categorization.
CREATE VIEW cleaned_amex AS
SELECT
o.Date,
o.Description,
o.Amount,
c.category,
o.Type
FROM
(
SELECT
substr(Date,
7)|| "-" || substr(Date,
1,
2)|| "-" || substr(Date,
4,
2) Date,
Description,
CASE
WHEN CAST(Amount AS decimal) > 0 THEN "Spend"
ELSE "Recieve"
END as Type,
Amount
FROM
(
SELECT
DISTINCT Date,
Description,
Amount
FROM
amex)) o
LEFT JOIN category c ON
o.Description LIKE c.description;
category
This table maps a transaction description to a category. It acts as a reference for labeling transactions in downstream views.
CREATE VIEW all_expenses as
SELECT
Date,
Description,
Amount,
Category,
Type
FROM
cleaned_ally ca
UNION ALL
SELECT
Date,
Description,
Amount,
Category,
Type
FROM
cleaned_amex ca2
UNION ALL
SELECT
Date,
Description,
Amount,
Category,
Type
FROM
cleaned_venmo cv;
🧠 Categorizing Transactions via CLI
Instead of manually updating spreadsheets, you can run the following Python script to interactively categorize new transactions:
import sqlite3
import os
import sys
def ask_user_input(row):
print(f"Description: {row}")
user_input = input("What is the category: ")
user_confirmation = input(f"You entered: {user_input}, is this right? (y/n)")
cleaned_user_confirmation = user_confirmation.strip()
if user_confirmation.lower() == "y":
return True, user_input
else:
print("You made a mistake, skipping to next row")
return False, None
def add_category(cursor, description, category):
cursor.execute('INSERT INTO category (description, category) VALUES (?,?)', (description, category))
return
def get_categories(cursor):
cursor.execute('SELECT distinct category FROM category ORDER BY 1 ASC')
# Fetch all rows from the querya
rows = cursor.fetchall()
# Print the rows
for row in rows:
print(row[0])
return
if __name__ == "__main__":
# Connect to the SQLite database
db_path = sys.argv[1]
if not os.path.isfile(db_path):
raise Exception('Database File not Found')
with sqlite3.connect(db_path) as conn:
cursor = conn.cursor()
while True:
cursor.execute(
'''
SELECT
*
FROM
all_expenses
where
category is NULL
ORDER BY
Date DESC
LIMIT 1
'''
)
# Fetch and print each row one by one
row = cursor.fetchone()
if not row:
print("No records to categorize")
break
os.system('clear')
print(get_categories(cursor))
print("------------------------------------------------------------------------------")
success, category = ask_user_input(row)
if success:
add_category(cursor, row[1], category)
conn.commit()
It works like this:
The CLI pulls the next uncategorized transaction.
It displays the description and asks for a category.
Upon confirmation, the description-category pair is inserted into the category table.
🧩 How Categories Are Applied to Expenses
All cleaned transactions from Amex, Ally, Venmo, and other sources are brought together through a unified view called all_expenses.
Each cleaned view (e.g., cleaned_amex, cleaned_ally) contains a Description field. These descriptions are joined with the category table to populate the Category field using SQL logic similar to:
SELECT
t.Date,
t.Description,
t.Amount,
t.Type,
c.category AS Category
FROM
cleaned_amex t
LEFT JOIN
category c ON t.Description = c.description
This join is what powers the categorization workflow. Once a user categorizes a transaction using the CLI, that same category will automatically show up the next time a transaction with the same description appears.
By centralizing categorization through a CLI and joining it via SQL, I’ve effectively built a self-reinforcing system that improves over time — the more I use it, the smarter and faster it gets.
📊 Visualization with Datasette
To explore and analyze transactions, I use Datasette, a lightweight tool that turns SQLite databases into browsable web interfaces with built-in SQL querying and CSV/JSON exports.
datasette personal_finances.sqlite
This opens a local web server (usually at http://127.0.0.1:8001) where you can:
Browse tables and views (all_expenses, category, etc.)
Write SQL queries directly in the browser
Export results as CSV, JSON, or Markdown
Share a read-only view of your finances (useful for accountability partners)
Datasette is perfect for fast, no-frills querying and visualization, especially for a small, local database.
🔁 Data Cleaning Utility
Inconsistent labels (like “Groceries “ vs “groceries”) can cause duplicate categories. I use a small script, clean_db.py
, to strip white space and normalize the category names:
import sqlite3
import os
def clean_category(category):
cleaned_category = category.strip()
if not cleaned_category:
cleaned_category = "not tracked"
return cleaned_category
def update_row(cursor, category, cleaned_category):
query = f"UPDATE category SET category = '{cleaned_category}' WHERE category = '{category}'"
cursor.execute(query)
return
if __name__ == "__main__":
db_path = 'personal_finances.sqlite'
if os.path.isfile(db_path):
with sqlite3.connect(db_path) as conn:
cursor = conn.cursor()
# Execute a query to select all rows from a table
cursor.execute('SELECT * FROM category')
# Fetch and print each row one by one
for row in cursor.fetchall():
description = row[0]
category = row[1]
print(f"cleaning category: {category}")
cleaned_cat = clean_category(category)
update_row(cursor, category, cleaned_cat)
else:
print("File does not exist.")
✅ Final Thoughts
This small, scriptable pipeline gave me more transparency into my finances than any budgeting app. It’s portable, private, and fits seamlessly into my workflow. Most importantly, it grows smarter with each use as new categories are added and reused automatically.
If you’re a technical user who wants control over their spending analysis, I highly recommend rolling your own system like this.
Subscribe to my newsletter
Read articles from Nikhil Rao directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

Nikhil Rao
Nikhil Rao
Los Angeles