Building a Personal and Corporate Quickbook for Tax and Expense Management Using OpenSource Tools
Managing bills and expenses can be challenging, especially when preparing for tax season or managing corporate finances. To streamline this process, I developed an open-source QuickBook tracker application that utilizes various open-source tools. This application allows users to upload bill images, which are then parsed using a custom AI/ML model to extract essential details like the supplier name, invoice number, date, tax, net amount, and category. The extracted details are stored in Notion, while the bill images are uploaded to Google Drive, providing an organized and efficient way to track and manage bills.
Key Features
Bill Image Upload: Users can upload images of bills/expenses directly through the Flask web application.
Google Drive Integration: Uploaded bill images are saved to Google Drive for easy access and backup.
Notion Database: Extracted bill details are stored in a Notion database, allowing users to track and manage their bills efficiently.
Secure and Scalable: The application is designed with security in mind, ensuring that user data is protected.
Cost-Effective: Instead of paying for expensive third-party services, this application can be built and maintained at no cost using open-source tools and APIs.
Technology Stack
Flask: A lightweight and flexible Python web framework used for building the web applications.
Google Drive API: Used to upload and manage bill images in Google Drive.
Notion API: Used to create and update entries in a Notion database with the extracted bill details.
Mindee API: An optional tool to enhance OCR capabilities, especially for more complex bill structures.
Step-by-Step Guide
- Setting Up the Flask Application:
Create a Flask application with a simple user interface for uploading bill images.
Implement secure user authentication to protect user data.
from flask import Flask, request, render_template, redirect, url_for
from werkzeug.utils import secure_filename
import os
from dotenv import load_dotenv
import base64
from utils.gdrive_helper import GoogleDriveHelper
from utils.notion_service import NotionDatabaseHandler
from utils.reciept_service import ReceiptParser
load_dotenv()
app = Flask(__name__)
credential_json = json.loads(os.getenv("GOOGLE_OAUTH_SERVICE_ACCOUNT"))
gdrive_service = GoogleDriveHelper(credential_json)
drive_folder_id = gdrive_service.create_folder("test_notion_upload")
receipt_service = ReceiptParser(os.getenv("MINDEE_API_KEY"))
notion_db_id = os.getenv("NOTION_DB_ID")
notion_service = NotionDatabaseHandler(
api_key=os.getenv("NOTION_KEY"),
database_id=notion_db_id
)
@app.route('/')
def index():
return render_template('index.html')
@app.route('/upload', methods=['POST'])
def upload_file():
if 'file' not in request.files:
return redirect(request.url)
file = request.files['file']
if file.filename == '':
return redirect(request.url)
if file:
filename = secure_filename(file.filename)
file_bytes = file.read()
file_base64 = base64.b64encode(file_bytes).decode('utf-8')
# Parse the reciept
reciept_details = receipt_service.parse_reciept_from_base64(file_base64, filename)
# Upload the file to the google drive
_, drive_url = gdrive_service.upload_file_obj(
file, filename, drive_folder_id
)
reciept_details["Receipt"] = {"name": filename, "url": drive_url}
# Create the payload from the user content
entry_payload = notion_service.create_entry_payload(reciept_details)
# Insert the entry into the database
if entry_payload:
notion_service.insert_entry(entry_payload)
flash("Image uploaded and processed successfully!", "success")
# Process the uploaded file here
return redirect(url_for('index'))
if __name__ == '__main__':
app.run(debug=True)
This code creates a simple web interface where users can upload bill images. The uploaded files are saved to a local directory for further processing.
2. Integrating AI/ML Model:
Develop a custom AI/ML model trained to parse bill receipts and extract information such as the supplier name, invoice number, amount, tax, net amount, category, and date.
Integrate the model into the Flask application to process uploaded bill images.
Alternatively, you can use Mindee to parse the invoice bill
from mindee import Client, product
class ReceiptParser:
def __init__(self, api_key):
self.mindee_client = Client(api_key=api_key)
def extract_bill_details(self, receipt_data):
output = {
"Name": receipt_data.document.inference.prediction.supplier_name.value or '',
"Invoice No": receipt_data.document.inference.prediction.receipt_number.value or '',
"Amount": receipt_data.document.inference.prediction.total_amount.value,
"Tax": receipt_data.document.inference.prediction.total_tax.value,
"Net": receipt_data.document.inference.prediction.total_net.value,
"Category": receipt_data.document.inference.prediction.category.value,
"Date": receipt_data.document.inference.prediction.date.value,
"Mode": "Credit Card",
}
print(f"Bill details: {output}")
return output
def parse_reciept_from_base64(self, base64_string, image_name):
input_doc = self.mindee_client.source_from_b64string(base64_string, image_name)
receipt_data = self.mindee_client.parse(product.ReceiptV5, input_doc)
result = self.extract_bill_details(receipt_data)
return result
Call the parse_reciept_from_base64
function in your upload handler to process each bill and extract the necessary information.
3. Uploading to Google Drive:
Use the Google Drive API to upload bill images directly from the application using the OAuth credentials.
Organize images in a structured way, such as by date or category, for easy retrieval.
class GoogleDriveHelper:
SCOPES = [
"https://www.googleapis.com/auth/drive.file",
"https://www.googleapis.com/auth/drive",
]
VALID_EXTENSIONS = [".pdf", ".png", ".jpeg", ".jpg"]
def __init__(self, credentail_json):
self.credentail_json_file = credentail_json
self.connect_using_service_account()
def connect_using_service_account(self):
creds = ServiceAccountCredentials.from_json_keyfile_dict(
self.credentail_json, self.SCOPES
)
self.drive_service = build(
"drive", "v3", credentials=creds, cache_discovery=False
)
def upload_file_obj(self, file_obj, filename, parent_folder_id=None):
mime_type, _ = mimetypes.guess_type(filename)
if mime_type is None:
mime_type = "application/octet-stream" # Default fallback MIME type
file_metadata = {
"name": filename,
"parents": [parent_folder_id] if parent_folder_id else [],
}
print(mime_type)
# Upload the file object using MediaIoBaseUpload
media = MediaIoBaseUpload(file_obj, mimetype=mime_type, resumable=True)
# Upload the file to Google Drive
drive_file = (
self.drive_service.files()
.create(body=file_metadata, media_body=media, fields="id")
.execute()
)
file_id = drive_file.get("id")
print(f"File uploaded successfully. File ID: {file_id}")
sharablelink = self.get_sharable_link(file_id)
return drive_file.get("id"), sharablelink
Integrate this function in the Flask application to upload the processed bill images directly to Google Drive.
4. Creating Notion Database Entries:
Set up a Notion database with fields for storing bill information.
Use the Notion API to create new entries in the database for each parsed bill, allowing users to view and manage their bills within Notion.
import requests
NOTION_TOKEN = 'your_notion_token'
NOTION_DATABASE_ID = 'your_database_id'
class NotionDatabaseHandler:
def __init__(self, api_key=NOTION_TOKEN, database_id=NOTION_DATABASE_ID):
self.api_key = api_key
self.database_id = database_id
self.headers = {
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json",
"Notion-Version": "2022-06-28", # Latest API version
}
def insert_entry(self, details):
data = {
"parent": {"database_id": self.database_id},
"properties": {
"Name": {"title": [{"text": {"content": details['supplier_name']}}]},
"Invoice Number": {"rich_text": [{"text": {"content": details['invoice_number']}}]},
"Date": {"date": {"start": details['date']}},
"Tax": {"number": details['tax']},
"Net Amount": {"number": details['net_amount']},
"Category": {"select": {"name": details['category']}}
}
}
response = requests.post('https://api.notion.com/v1/pages', headers=self.headers, json=data)
if response.status_code == 200:
print('Notion entry created successfully')
else:
print(f'Failed to create Notion entry: {response.content}')
This function sends a POST request to the Notion API to create a new entry with the parsed bill details.
5. Viewing and Tracking Bills:
- Provide a user-friendly dashboard within the Flask application where users can view and manage their bills. Use data from the Notion database to populate the dashboard, allowing for easy tracking and filtering by date, category, or amount.
Benefits
Time-Saving: Automates the manual task of tracking and organizing bills, freeing up time for other tasks.
Accuracy: Reduces errors by using AI/ML models to extract bill details reliably.
Centralized Management: Stores all bill information in a single place, making it easy to track and manage finances.
Open Source and Cost-Free: Built using open-source tools, this application eliminates the need for expensive third-party services, allowing you to manage your bills at no cost.
Tax Readiness: By categorizing and organizing expenses, the application helps users stay prepared for tax season, simplifying the process of compiling deductible expenses.
Customizable: Easily extend and modify the application to fit your specific needs or add new features.
Conclusion
This open-source bill tracker application offers a comprehensive solution for managing bills, especially for tax and expense management. By utilizing Flask, Mindee, Google Drive, and Notion, you can create a powerful tool to keep your finances organized — all at no cost. Whether you’re managing personal finances or handling corporate bills, this application simplifies the process and eliminates the need for expensive third-party services.
For those interested in exploring the full code or contributing to the project, please feel free to reach out to me at girishcodealchemy@gmail.com*. I’m excited to share the complete implementation and collaborate on further improvements!*
🚀 Stay tuned for more powerful insights into open-source tools for seamless expense and tax management. Happy organizing!!! 🌐
✈️ Github: github.com/GirishCodeAlchemy
✈️Linkedin:linkedin.com/in/vgirish10
Subscribe to my newsletter
Read articles from Girish V directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by