Managing personal finances can quickly spiral into a cluttered mess of CSV exports and disorganized spreadsheets. I wanted a workflow that was repeatable, scriptable, and easy to visualize. So I built a simple CLI in Python to help categorize and analyze my expenses using SQLite and Superset.

This post walks through how the system is designed, how to use the CLI to tag expenses, and how the data flows into a unified view for analysis.

📂 System Overview

All financial data is imported and processed into a local SQLite database. The schema is intentionally simple and centered around these components:

📋 Tables and Views

cleaned_ally view

A deduplicated and formatted version of Ally transaction exports. The view casts Amount as a decimal and removes leading spaces from CSV headers:

CREATE VIEW cleaned_ally AS 
SELECT
    Date,
    CASE
        WHEN tp = 'Withdrawal' THEN 'Spend'
        ELSE 'Recieve'
    END AS Type,
    " Description" AS Description,
    CAST(" Amount" AS decimal) AS Amount,
    NULL AS Category
FROM (
    SELECT DISTINCT
        Date,
        " Type" AS tp,
        " Description",
        " Amount"
    FROM ally
);

cleaned_amex view

This view parses dates, handles withdrawal/deposit classification, and is ready for categorization.

CREATE VIEW cleaned_amex AS
SELECT
    o.Date,
    o.Description,
    o.Amount,
    c.category,
    o.Type
FROM
    (
    SELECT
        substr(Date,
    7)|| "-" || substr(Date,
    1,
    2)|| "-" || substr(Date,
    4,
    2) Date,
        Description,
        CASE
            WHEN CAST(Amount AS decimal) > 0 THEN "Spend"
            ELSE "Recieve"
        END as Type,
        Amount
    FROM
        (
        SELECT
            DISTINCT Date,
            Description,
            Amount
        FROM
            amex)) o
LEFT JOIN category c ON
    o.Description LIKE c.description;

🧠 Categorizing Transactions via CLI

Instead of manually updating spreadsheets, you can run the following Python script to interactively categorize new transactions:

import sqlite3
import os
import sys

def ask_user_input(row):
    print(f"Description: {row}")
    user_input = input("What is the category: ")
    user_confirmation = input(f"You entered: {user_input}, is this right? (y/n)")
    cleaned_user_confirmation = user_confirmation.strip()
    if user_confirmation.lower() == "y":
        return True, user_input
    else:
        print("You made a mistake, skipping to next row")
        return False, None

def add_category(cursor, description, category):
    cursor.execute('INSERT INTO category (description, category) VALUES (?,?)', (description, category))
    return

def get_categories(cursor):
    cursor.execute('SELECT distinct category FROM category ORDER BY 1 ASC')
    # Fetch all rows from the querya
    rows = cursor.fetchall()

    # Print the rows
    for row in rows:
        print(row[0])
    return

if __name__ == "__main__":
    # Connect to the SQLite database
    db_path = sys.argv[1]
    if not os.path.isfile(db_path):
        raise Exception('Database File not Found')

    with sqlite3.connect(db_path) as conn:
        cursor = conn.cursor()
        while True:
            cursor.execute(
                '''
                SELECT
                    *
                FROM
                    all_expenses
                where
                    category is NULL
                ORDER BY
                    Date DESC
                LIMIT 1
                '''
            )
            # Fetch and print each row one by one
            row = cursor.fetchone()
            if not row:
                print("No records to categorize")
                break
            os.system('clear')
            print(get_categories(cursor))
            print("------------------------------------------------------------------------------")
            success, category = ask_user_input(row)
            if success:
                add_category(cursor, row[1], category)
                conn.commit()

It works like this:

The CLI pulls the next uncategorized transaction.
It displays the description and asks for a category.
Upon confirmation, the description-category pair is inserted into the category table.

🧩 How Categories Are Applied to Expenses

All cleaned transactions from Amex, Ally, Venmo, and other sources are brought together through a unified view called all_expenses.

Each cleaned view (e.g., cleaned_amex, cleaned_ally) contains a Description field. These descriptions are joined with the category table to populate the Category field using SQL logic similar to:

SELECT
    t.Date,
    t.Description,
    t.Amount,
    t.Type,
    c.category AS Category
FROM
    cleaned_amex t
LEFT JOIN
    category c ON t.Description = c.description

This join is what powers the categorization workflow. Once a user categorizes a transaction using the CLI, that same category will automatically show up the next time a transaction with the same description appears.

By centralizing categorization through a CLI and joining it via SQL, I’ve effectively built a self-reinforcing system that improves over time — the more I use it, the smarter and faster it gets.

📊 Visualization with Datasette

To explore and analyze transactions, I use Datasette, a lightweight tool that turns SQLite databases into browsable web interfaces with built-in SQL querying and CSV/JSON exports.

datasette personal_finances.sqlite

This opens a local web server (usually at http://127.0.0.1:8001) where you can:

Browse tables and views (all_expenses, category, etc.)
Write SQL queries directly in the browser
Export results as CSV, JSON, or Markdown
Share a read-only view of your finances (useful for accountability partners)

Datasette is perfect for fast, no-frills querying and visualization, especially for a small, local database.

🔁 Data Cleaning Utility

Inconsistent labels (like “Groceries “ vs “groceries”) can cause duplicate categories. I use a small script, clean_db.py, to strip white space and normalize the category names:

import sqlite3
import os


def clean_category(category):
    cleaned_category = category.strip()
    if not cleaned_category:
        cleaned_category = "not tracked"
    return cleaned_category

def update_row(cursor, category, cleaned_category):
    query = f"UPDATE category SET category = '{cleaned_category}' WHERE category = '{category}'"
    cursor.execute(query)
    return

if __name__ == "__main__":
    db_path = 'personal_finances.sqlite'
    if os.path.isfile(db_path):
        with sqlite3.connect(db_path) as conn:
            cursor = conn.cursor()

            # Execute a query to select all rows from a table
            cursor.execute('SELECT * FROM category')

            # Fetch and print each row one by one
            for row in cursor.fetchall():
                description = row[0]
                category = row[1]
                print(f"cleaning category: {category}")
                cleaned_cat = clean_category(category)
                update_row(cursor, category, cleaned_cat)
    else:
        print("File does not exist.")

✅ Final Thoughts

This small, scriptable pipeline gave me more transparency into my finances than any budgeting app. It’s portable, private, and fits seamlessly into my workflow. Most importantly, it grows smarter with each use as new categories are added and reused automatically.

If you’re a technical user who wants control over their spending analysis, I highly recommend rolling your own system like this.

💸 Building a Personal Expense Tracker with Python and SQLite

Table of contents