GRAPHAY: Building an AI-Powered Invoice Automation System


It was 3 AM, and I was drowning in unread emails, desperately searching for one critical invoice. The deadline had already passed, our servers were about to get cut off, and my CTO was hovering over my shoulder.
And there I was… still copy-pasting invoice details into a spreadsheet for what felt like the hundredth time that week.
If you’ve ever dealt with invoices at a growing startup, you probably know this pain:
Invoices getting lost in endless email threads
Accidentally paying the same bill twice
Missing deadlines and racking up late fees
Manual copy-paste mistakes causing failed payments
No clear way to see where an invoice stood in the approval chain
What started as a simple “just forward invoices to finance” process had turned into a complete nightmare.
The Vision: What If Invoices Could Process Themselves?
I built a system in which invoices move smoothly from the moment they arrive in email all the way to payment confirmation, with no manual work except the approval step.
Here’s what it can do:
Automatically pick up invoice emails, extract the data, and store documents in Google Drive
Validate the extracted data and handle missing details with automated follow-up emails
Let our team approve invoices directly inside Discord
Verify Ethereum transactions using the Etherscan API, so payments can be confirmed on-chain
Keep a full audit trail in Google Sheets with live status updates for every invoice
The outcome? What used to take 2–3 days now takes less than 4 hours. Data entry mistakes are gone, and we now have full visibility into our invoice process from start to finish.
Architecture Overview: The Seven Samurai Pipeline🥷
1: Email Intelligence – The Watchful Guardian
Scans unread emails in the inbox (sidharth@hackgrid.com) every 5 minutes using the Gmail API
Detects and picks up invoice emails automatically
2: Google Drive Integration – The Digital Filing Cabinet
Stores every invoice safely with a unique ID and a folder structure
Makes retrieval lightning fast with metadata tagging
3: AI Document Processing – The Data Detective
Reads invoices and extracts key details such as amount, vendor, and due date
Converts unstructured documents into structured, usable data
Ensures data accuracy with validation checks
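To make the validation step concrete, here is a minimal Python sketch. The field names (vendor, amount, invoice_date, due_date) and the rules are assumptions for illustration, not necessarily the schema Graphay uses.

```python
from datetime import date
from decimal import Decimal, InvalidOperation

# Hypothetical field names; the real extraction schema may differ.
REQUIRED_FIELDS = ("vendor", "amount", "invoice_date", "due_date")


def validate_invoice(fields: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []

    # Every required field must be present and non-blank.
    for name in REQUIRED_FIELDS:
        if not str(fields.get(name) or "").strip():
            problems.append(f"missing {name}")

    # The amount must parse as a positive decimal number.
    amount = fields.get("amount")
    if amount:
        try:
            if Decimal(str(amount)) <= 0:
                problems.append("amount must be positive")
        except InvalidOperation:
            problems.append("amount is not a number")

    # Dates are assumed to be normalized to ISO format by the extractor.
    for name in ("invoice_date", "due_date"):
        value = fields.get(name)
        if value:
            try:
                date.fromisoformat(str(value))
            except ValueError:
                problems.append(f"{name} is not an ISO date (YYYY-MM-DD)")

    return problems
```

An empty result lets the invoice move on to approval, while any "missing ..." entry can be handed to the follow-up stage.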
4: Google Sheets – The Central Command Center
Tracks every invoice from arrival to payment
Serves as the single source of truth for the team
Offers a clear, real-time dashboard for visibility
5: Smart Info Gathering – The Persistent Assistant
Handles missing details (e.g., vendor name, invoice date) automatically
Sends personalized follow-ups to vendors for clarification
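A follow-up message like this can be assembled with Python's standard library. This is only a sketch: the function name, the wording of the email, and the invoice ID format are made up for illustration.

```python
from email.message import EmailMessage


def build_follow_up(vendor_email: str, invoice_id: str, missing: list[str]) -> EmailMessage:
    """Draft an email asking the vendor for the details we could not extract."""
    msg = EmailMessage()
    msg["To"] = vendor_email
    msg["Subject"] = f"Invoice {invoice_id}: additional details needed"

    # One line per missing field so the vendor can answer point by point.
    bullet_list = "\n".join(f"- {item}" for item in missing)
    msg.set_content(
        "Hello,\n\n"
        f"We received invoice {invoice_id} but could not find the following details:\n"
        f"{bullet_list}\n\n"
        "Could you reply with the missing information so we can process it?\n\n"
        "Thank you."
    )
    return msg
```

The resulting message object still has to be sent. With the Gmail API that means passing its base64url-encoded bytes as the raw message, which is left out here to keep the sketch free of network calls.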
6: Discord Approvals – The Collaboration Hub
Lets the team approve invoices directly inside a Discord thread
Keeps a human in the loop, so a person always makes the final approval decision
7: Blockchain Verification – The Trust Layer
Verifies Ethereum transactions securely via the Etherscan API
Matches transactions with vendor details to prevent errors
Posts the amount and status of the transaction in the Discord thread and updates Google Sheets
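The matching logic can be sketched as a pure function over data that has already been fetched. I am assuming the transaction dict is shaped like an eth_getTransactionByHash result (hex-encoded value in wei, a to address) and that the receipt status is "1" for success and "0" for failure. The function name, the returned keys, and the exact-amount rule are illustrative choices, not Graphay's actual code.

```python
from decimal import Decimal

WEI_PER_ETH = Decimal(10) ** 18


def check_payment(tx: dict, receipt_status: str, vendor_address: str,
                  expected_eth: Decimal) -> dict:
    """Compare a fetched transaction against what the invoice expects."""
    # `value` is assumed to be a hex string in wei, e.g. "0x14d1120d7b160000".
    paid_eth = Decimal(int(tx["value"], 16)) / WEI_PER_ETH

    # Addresses are compared case-insensitively (checksummed vs lowercase).
    recipient_ok = (tx.get("to") or "").lower() == vendor_address.lower()
    amount_ok = paid_eth == expected_eth
    succeeded = receipt_status == "1"

    return {
        "amount_eth": paid_eth,
        "recipient_ok": recipient_ok,
        "amount_ok": amount_ok,
        "succeeded": succeeded,
        "verified": recipient_ok and amount_ok and succeeded,
    }
```

Keeping the network call out of this function makes the check easy to unit test, and the returned dict has everything needed for the thread message and the sheet update.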
Making the System Stronger and Smarter:
1. Concurrent Processing Architecture
One crucial design decision was implementing concurrent processing for the different workflow stages.
Multiple invoices can be in different states at the same time without blocking each other.
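As a rough illustration of the idea, here is a small asyncio sketch in which each invoice runs as its own task and a semaphore caps how many run at once. The function names and the sleep standing in for real work are assumptions. The actual pipeline is orchestrated with LangGraph and may handle concurrency differently.

```python
import asyncio


async def process_invoice(invoice_id: str, delay: float) -> str:
    # Stand-in for a real stage (extraction, waiting on approval, verification).
    await asyncio.sleep(delay)
    return f"{invoice_id}: done"


async def process_all(invoices: dict[str, float], max_parallel: int = 5) -> list[str]:
    """Run every invoice concurrently, with at most `max_parallel` in flight."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(invoice_id: str, delay: float) -> str:
        async with semaphore:
            return await process_invoice(invoice_id, delay)

    # gather() returns results in input order, regardless of finish order.
    return await asyncio.gather(*(guarded(i, d) for i, d in invoices.items()))
```

A slow invoice (say, one waiting on a vendor reply) only occupies its own task, so the others keep moving.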
2. Error Recovery and Retry Logic
Real-world systems need robust error handling. Our system retries failed calls with exponential backoff, so a transient API error delays an invoice instead of dropping it.
When a task does fail, the system can restart or reroute it, which keeps disruption minimal and avoids losing data.
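For readers new to the pattern, exponential backoff with jitter looks roughly like this. It is a generic sketch with made-up parameter names and defaults, not the retry code from the project.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (capped) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Out of attempts: let the caller see the last error.
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

In a real pipeline you would usually catch only errors that are worth retrying (timeouts, HTTP 429 or 5xx responses) rather than every exception.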
3. Rate Limiting and API Management
Since the workflow integrates with several external APIs (Gmail, Google Drive, Discord, and AI services), rate limiting and API quota management are critical.
Our system incorporates adaptive request scheduling, batching where possible, and backoff strategies to ensure smooth communication with APIs.
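One common way to pace outgoing requests is a token bucket. The sketch below is a generic version with a hypothetical class name and parameters. It is not necessarily how Graphay schedules its calls, and real quotas differ per API.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` and a steady `rate` of requests per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False if the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Using one bucket per external API keeps a burst of Discord messages from eating into the Gmail or Sheets budget.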
Under the Hood: The Tech Stack
LangGraph - Orchestration backbone for chaining Gmail, Drive, AI extraction, and approval workflows into a resilient pipeline
GPT-4 - Used for embeddings, retrieval-augmented generation (RAG), and extracting structured invoice data with validation checks
Etherscan API - Verifies Ethereum transactions, ensuring payments match vendor details and are fully confirmed
Google APIs (Gmail, Drive, Sheets) - Handle email intake, secure document storage, and single-source-of-truth tracking
Discord API - Integrates directly into our team workspace for streamlined approvals and real-time collaboration
Challenges & Solutions:
Unexpected Node Failures
Problem: A crucial node would sometimes shut down without warning, breaking the entire flow.
Solution: We added detailed logs at every step. This gave us a clear trail to quickly identify and fix the failing component.
Complex System Mapping
Problem: Visualizing all nodes and connections was overwhelming and confusing.
Solution: We sketched the architecture manually in a notebook first. This simple step gave us a clearer roadmap before coding.
Approval Bottlenecks
Problem: When multiple invoices waited for approval, only one moved forward while others got stuck and never returned a response.
Solution: We improved concurrency handling, making sure parallel approval requests didn’t block each other.
Duplicate & Approved Invoice Detection
Problem: Identifying already-approved invoices relied heavily on Google Sheets, making it hard to extract the exact confirmation status.
Solution: We added smarter lookups with unique identifiers and metadata checks to reliably detect duplicates and confirmations.
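One way to build such an identifier is to hash a few normalized invoice fields. The choice of fields (vendor, invoice number, amount) and the in-memory set are assumptions for this sketch. In practice the fingerprints could live in a column of the tracking sheet.

```python
import hashlib


def invoice_fingerprint(vendor: str, invoice_number: str, amount: str) -> str:
    """Build a stable ID from fields that together should identify one invoice."""
    # Normalize case and whitespace so cosmetic differences do not matter.
    normalized = "|".join(part.strip().lower() for part in (vendor, invoice_number, amount))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_duplicate(fingerprint: str, seen: set[str]) -> bool:
    """Return True if already seen; otherwise remember it and return False."""
    if fingerprint in seen:
        return True
    seen.add(fingerprint)
    return False
```

Because the fingerprint depends only on the invoice content, the same bill forwarded twice or re-sent by the vendor maps to the same ID.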
If you’d like to explore the full project, you can check it out here:
https://github.com/Sidharth1743/Graphay
High-level flowchart: https://tinyurl.com/fx4ps63y
Final Words:
Thanks for taking the time to read!
If you’d like to connect or share your thoughts, you can find me on X and LinkedIn.
Follow my journey via #100DaysofAIEngineer on X to see my daily work and progress.
Keep building! Get your hands dirty!
- Sidharthan, HackGrid
Written by

Sidharthan
• A guy from a super tier-2 government college with no mentors or guides, just self-learning. • Passionately exploring emerging trends and innovations, with a deep interest in safe and interpretable AI inspired by Anthropic’s vision. • Documenting my journey, experiments, and insights through a "learn in public" approach; every post reflects my personal perspective and hands-on experience.