Building Chat History Explorer: From Late-Night Idea to Local AI-Powered WhatsApp Analyzer

April 2025 – This project began the same way most of my side projects do: with a .zip
file, a bit of curiosity, and an urge to see what would happen if I could analyze my own WhatsApp chats.
At first, the idea was simple—count some messages, maybe chart when conversations peaked. But the more I thought about it, the more I realized: this could be so much more.
What if I could generate stories out of chats?
What if I could visualize ghost gaps, or recreate my chaotic message threads as memes?
What if I could build a memory timeline, all running entirely offline?
So, I built Chat History Explorer—a privacy-first, locally hosted app that turns exported WhatsApp chats into analytics, timelines, and even creative outputs like zines and AI-written stories.
The Starting Point: What Did I Want?
I wasn’t aiming for another sterile dashboard.
I wanted something that was:
- Local & Secure – All data stays on my machine.
- Flexible – Local LLMs like Ollama for all AI tasks.
- Playful – From Shakespearean rewrites to timeline-based storybook chapters.
- Personal – Less enterprise SaaS, more scrapbook with vibes.
I sketched the vision, wrote down the wildest features I could think of, and then got to work building the foundation.
The Core I Built First
1. One-Click Upload + Animated Feedback
- I built a Home Page with a big “Upload WhatsApp Chat” button.
- It accepts a .zip file with:
  - One chat.txt file (the exported WhatsApp text log)
  - Media files in a flat folder
- As soon as I uploaded a file, a little animation kicked in:
Uploading → Processing Data → Complete
- When it was done, a clear message appeared:
"Your data is ready for analysis!"
2. Parsing and Enrichment
The parser had to be smart. WhatsApp export formats can vary—some use 12-hour clocks, others 24. Some are DD/MM/YYYY, others MM/DD/YYYY. And multiline messages? Always messy.
For each message, I extracted:
- timestamp, sender, text
- has_media (based on <Media omitted> or a filename match)
- emoji_count, word_count, char_count
- hour, weekday, and a daypart label (morning, afternoon, evening, night)
- A flag if it marked a gap (>2 days between messages)
- Another if it looked like a reply
Everything was stored in messages.db, powered by SQLite.
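As a rough sketch of how that parsing step can work (the regexes and names here are illustrative and only cover the two export styles above), something like this handles message headers and multiline continuations:

```python
import re
from datetime import datetime

# Two common header styles: 24-hour and 12-hour exports.
HEADER_24H = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}) - ([^:]+): (.*)$")
HEADER_12H = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}\s?[apAP][mM]) - ([^:]+): (.*)$")
DATE_FORMATS = [
    "%d/%m/%Y %H:%M", "%m/%d/%Y %H:%M", "%d/%m/%y %H:%M",
    "%d/%m/%y %I:%M %p", "%m/%d/%y %I:%M %p",
]

def parse_timestamp(date_part, time_part):
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(f"{date_part} {time_part}", fmt)
        except ValueError:
            continue
    return None

def parse_chat(path):
    """Yield one dict per message; lines without a header are folded
    into the previous message as multiline continuations."""
    current = None
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.rstrip("\n")
            m = HEADER_24H.match(line) or HEADER_12H.match(line)
            ts = parse_timestamp(m.group(1), m.group(2)) if m else None
            if m and ts:
                if current:
                    yield current
                current = {
                    "timestamp": ts,
                    "sender": m.group(3).strip(),
                    "text": m.group(4),
                    "has_media": "<Media omitted>" in m.group(4),
                }
            elif current:
                current["text"] += "\n" + line
    if current:
        yield current
```

Each yielded message then picks up the enrichment columns (emoji_count, hour, daypart, gap and reply flags) before being inserted into messages.db.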
3. Chunking + Insight Files
Next, I divided the chats into time-based chunks—monthly by default, weekly if the volume was high.
For each chunk, I computed:
- Total messages
- Average message length
- Emoji frequency
- Nighttime message ratio (10 PM – 3 AM)
- Sentiment score (via VADER/TextBlob)
- A tone summary using Ollama (locally)
I also assigned each chunk a phase label, like:
"Hi." → “The Emoji Era” → “Ghost Gap #1” → “Memes That Saved Us”
All of that went into a chunk_insights.jsonl file.
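A simplified version of that per-chunk pass, assuming the enriched messages from the parsing step and the vaderSentiment package (the Ollama tone summary and phase labeling are left out here for brevity):

```python
import json
from collections import defaultdict
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def build_chunk_insights(messages, out_path="chunk_insights.jsonl"):
    # Monthly buckets like "2025-04"; switch to ISO weeks when volume is high.
    chunks = defaultdict(list)
    for msg in messages:
        chunks[msg["timestamp"].strftime("%Y-%m")].append(msg)

    with open(out_path, "w", encoding="utf-8") as out:
        for key in sorted(chunks):
            msgs = chunks[key]
            texts = [m["text"] for m in msgs if not m["has_media"]]
            night = [m for m in msgs
                     if m["timestamp"].hour >= 22 or m["timestamp"].hour < 3]
            insight = {
                "chunk": key,
                "total_messages": len(msgs),
                "avg_length": sum(len(t) for t in texts) / max(len(texts), 1),
                "night_ratio": len(night) / len(msgs),
                "sentiment": sum(analyzer.polarity_scores(t)["compound"]
                                 for t in texts) / max(len(texts), 1),
            }
            out.write(json.dumps(insight) + "\n")
```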
4. Global Metadata
At the full-chat level, I calculated:
- Total messages/media
- Start/end dates
- Longest silence
- Most active day/hour
- Emoji usage ratio
- First and last messages
- Participants
Stored in global_metadata.json, this sets the stage for overview cards, filters, and story mode navigation later.
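Since everything already lives in SQLite, this rollup is mostly a handful of queries. A sketch with illustrative column names (not the exact schema):

```python
import json
import sqlite3

def build_global_metadata(db_path="messages.db", out_path="global_metadata.json"):
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    total, media = cur.execute(
        "SELECT COUNT(*), SUM(has_media) FROM messages").fetchone()
    start, end = cur.execute(
        "SELECT MIN(timestamp), MAX(timestamp) FROM messages").fetchone()
    busiest = cur.execute(
        "SELECT DATE(timestamp) AS d, COUNT(*) AS c FROM messages "
        "GROUP BY d ORDER BY c DESC LIMIT 1").fetchone()
    participants = [row[0] for row in cur.execute(
        "SELECT DISTINCT sender FROM messages")]
    conn.close()

    metadata = {
        "total_messages": total,
        "total_media": media or 0,
        "start_date": start,
        "end_date": end,
        "most_active_day": busiest[0] if busiest else None,
        "participants": participants,
    }
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
```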
Adding Embeddings with FAISS
Since some chats can hit 100K+ messages, I needed a way to retrieve high-context segments efficiently.
So I:
- Embedded each chunk summary using a local model
- Stored it in a local FAISS index
This enabled smarter context-aware insights within the app—all without touching the internet.
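Roughly, the indexing step looks like this; all-MiniLM-L6-v2 is just a stand-in for whatever local sentence-transformer you prefer, and the tone_summary key name is illustrative:

```python
import json
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any local embedding model works

def build_chunk_index(jsonl_path="chunk_insights.jsonl", index_path="chunks.faiss"):
    summaries, keys = [], []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            chunk = json.loads(line)
            summaries.append(chunk.get("tone_summary", ""))  # illustrative key name
            keys.append(chunk["chunk"])

    vectors = model.encode(summaries, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vectors.shape[1])  # cosine similarity via inner product
    index.add(np.asarray(vectors, dtype="float32"))
    faiss.write_index(index, index_path)
    return index, keys
```

At query time, embedding a question with the same model and calling index.search pulls back the most relevant chunks to hand to Ollama as context.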
State Management + Notifications
- I implemented a backend state machine that tracks:
Uploading → Processing Data → Complete
- A simple GET /status endpoint exposes the current state
- The frontend polls this and reflects it in real time, both on the Home Page and a dedicated Status Page
No guesswork. The user always knows what’s happening.
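A bare-bones version of the status flow looks roughly like this (sketched with FastAPI here as an example; the stage names simply mirror the animation steps):

```python
from enum import Enum
from fastapi import FastAPI

class Stage(str, Enum):
    IDLE = "idle"
    UPLOADING = "uploading"
    PROCESSING_DATA = "processing_data"
    COMPLETE = "complete"

app = FastAPI()
current_stage = Stage.IDLE  # single in-process state for the active upload

def set_stage(stage: Stage):
    """Called by the upload and processing pipeline as it advances."""
    global current_stage
    current_stage = stage

@app.get("/status")
def get_status():
    # The frontend polls this and maps each stage to the animation step.
    return {"stage": current_stage.value}
```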
Frontend Integration (No UI Tweaks)
I built a basic React frontend with these pages:
- Home Page – Upload button, animated status
- Chats Page – Lists uploaded chats with metadata
- Settings Page – Lets me configure Ollama as the AI engine
- Status Page – Shows real-time updates
- How To Use Page – Fully documented, no placeholders
- About Page – Brief overview + GitHub link
Everything talks to the backend, cleanly and in sync.
What Comes Next?
With ingestion, parsing, enrichment, and storage in place, I now have a powerful base layer to build on. Next features on my radar:
- Chat analytics: Heatmaps, ghost-gap graphs, keyword trends
- Storybook mode: Interactive chapter-based timeline
- Generative tools:
- Short stories and zines
- Meme captions
- Chatbot simulation in my tone
- Panic archives for motivational posters
Because I’ve stored everything properly—once—I can build all of these without reprocessing a thing.
Why I’m Building It This Way
Most people analyze chats in dashboards.
I want to remember, laugh, and turn those messages into art.
I didn’t just want metrics. I wanted a timeline of shared life, the weird stuff, the quiet patches, the comeback phases, the passive-aggressive emojis.
All offline. All mine.
If you're building something similar—or want to try this out—I’d love feedback or contributions. I’m documenting every piece along the way.