Building Chat History Explorer: From Late-Night Idea to Local AI-Powered WhatsApp Analyzer

April 2025 – This project began the same way most of my side projects do: with a .zip
file, a bit of curiosity, and an urge to see what would happen if I could analyze my own WhatsApp chats.
At first, the idea was simple—count some messages, maybe chart when conversations peaked. But the more I thought about it, the more I realized: this could be so much more.
What if I could generate stories out of chats?
What if I could visualize ghost gaps, or recreate my chaotic message threads as memes?
What if I could build a memory timeline, all running entirely offline?
So, I built Chat History Explorer—a privacy-first, locally hosted app that turns exported WhatsApp chats into analytics, timelines, and even creative outputs like zines and AI-written stories.
The Starting Point: What Did I Want?
I wasn’t aiming for another sterile dashboard.
I wanted something that was:
- Local & Secure – All data stays on my machine.
- Flexible – Local LLMs like Ollama for all AI tasks.
- Playful – From Shakespearean rewrites to timeline-based storybook chapters.
- Personal – Less enterprise SaaS, more scrapbook with vibes.
I sketched the vision, wrote down the wildest features I could think of, and then got to work building the foundation.
The Core I Built First
1. One-Click Upload + Animated Feedback
- I built a Home Page with a big “Upload WhatsApp Chat” button.
- It accepts a .zip file with:
  - One chat.txt file (the exported WhatsApp text log)
  - Media files in a flat folder
- As soon as I uploaded a file, a little animation kicked in:
Uploading → Processing Data → Complete
- When it was done, a clear message appeared:
"Your data is ready for analysis!"
2. Parsing and Enrichment
The parser had to be smart. WhatsApp export formats can vary—some use 12-hour clocks, others 24. Some are DD/MM/YYYY, others MM/DD/YYYY. And multiline messages? Always messy.
For each message, I extracted:
- timestamp, sender, text
- has_media (based on <Media omitted> or a filename match)
- emoji_count, word_count, char_count
- hour, weekday, and a daypart label (morning, afternoon, evening, night)
- A flag if it marked a gap (>2 days between messages)
- Another if it looked like a reply
Everything was stored in messages.db, powered by SQLite.
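As a rough sketch of how that parsing step can work (the regexes and names here are illustrative and only cover the two export styles above), something like this handles message headers and multiline continuations:

```python
import re
from datetime import datetime

# Two common header styles: 24-hour and 12-hour exports.
HEADER_24H = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}) - ([^:]+): (.*)$")
HEADER_12H = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}\s?[apAP][mM]) - ([^:]+): (.*)$")
DATE_FORMATS = [
    "%d/%m/%Y %H:%M", "%m/%d/%Y %H:%M", "%d/%m/%y %H:%M",
    "%d/%m/%y %I:%M %p", "%m/%d/%y %I:%M %p",
]

def parse_timestamp(date_part, time_part):
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(f"{date_part} {time_part}", fmt)
        except ValueError:
            continue
    return None

def parse_chat(path):
    """Yield one dict per message; lines without a header are folded
    into the previous message as multiline continuations."""
    current = None
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.rstrip("\n")
            m = HEADER_24H.match(line) or HEADER_12H.match(line)
            ts = parse_timestamp(m.group(1), m.group(2)) if m else None
            if m and ts:
                if current:
                    yield current
                current = {
                    "timestamp": ts,
                    "sender": m.group(3).strip(),
                    "text": m.group(4),
                    "has_media": "<Media omitted>" in m.group(4),
                }
            elif current:
                current["text"] += "\n" + line
    if current:
        yield current
```

Each yielded message then picks up the enrichment columns (emoji_count, hour, daypart, gap and reply flags) before being inserted into messages.db.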
3. Chunking + Insight Files
Next, I divided the chats into time-based chunks—monthly by default, weekly if the volume was high.
For each chunk, I computed:
- Total messages
- Average message length
- Emoji frequency
- Nighttime message ratio (10 PM – 3 AM)
- Sentiment score (via VADER/TextBlob)
- A tone summary using Ollama (locally)
I also assigned each chunk a phase label, like:
"Hi." → “The Emoji Era” → “Ghost Gap #1” → “Memes That Saved Us”
All of that went into a chunk_insights.jsonl file.
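A simplified version of that per-chunk pass, assuming the enriched messages from the parsing step and the vaderSentiment package (the Ollama tone summary and phase labeling are left out here for brevity):

```python
import json
from collections import defaultdict
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def build_chunk_insights(messages, out_path="chunk_insights.jsonl"):
    # Monthly buckets like "2025-04"; switch to ISO weeks when volume is high.
    chunks = defaultdict(list)
    for msg in messages:
        chunks[msg["timestamp"].strftime("%Y-%m")].append(msg)

    with open(out_path, "w", encoding="utf-8") as out:
        for key in sorted(chunks):
            msgs = chunks[key]
            texts = [m["text"] for m in msgs if not m["has_media"]]
            night = [m for m in msgs
                     if m["timestamp"].hour >= 22 or m["timestamp"].hour < 3]
            insight = {
                "chunk": key,
                "total_messages": len(msgs),
                "avg_length": sum(len(t) for t in texts) / max(len(texts), 1),
                "night_ratio": len(night) / len(msgs),
                "sentiment": sum(analyzer.polarity_scores(t)["compound"]
                                 for t in texts) / max(len(texts), 1),
            }
            out.write(json.dumps(insight) + "\n")
```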
4. Global Metadata
At the full-chat level, I calculated:
- Total messages/media
- Start/end dates
- Longest silence
- Most active day/hour
- Emoji usage ratio
- First and last messages
- Participants
Stored in global_metadata.json, this sets the stage for overview cards, filters, and story mode navigation later.
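Since everything already lives in SQLite, this rollup is mostly a handful of queries. A sketch with illustrative column names (not the exact schema):

```python
import json
import sqlite3

def build_global_metadata(db_path="messages.db", out_path="global_metadata.json"):
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    total, media = cur.execute(
        "SELECT COUNT(*), SUM(has_media) FROM messages").fetchone()
    start, end = cur.execute(
        "SELECT MIN(timestamp), MAX(timestamp) FROM messages").fetchone()
    busiest = cur.execute(
        "SELECT DATE(timestamp) AS d, COUNT(*) AS c FROM messages "
        "GROUP BY d ORDER BY c DESC LIMIT 1").fetchone()
    participants = [row[0] for row in cur.execute(
        "SELECT DISTINCT sender FROM messages")]
    conn.close()

    metadata = {
        "total_messages": total,
        "total_media": media or 0,
        "start_date": start,
        "end_date": end,
        "most_active_day": busiest[0] if busiest else None,
        "participants": participants,
    }
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
```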
Adding Embeddings with FAISS
Since some chats can hit 100K+ messages, I needed a way to retrieve high-context segments efficiently.
So I:
- Embedded each chunk summary using a local model
- Stored it in a local FAISS index
This enabled smarter context-aware insights within the app—all without touching the internet.
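Roughly, the indexing step looks like this; all-MiniLM-L6-v2 is just a stand-in for whatever local sentence-transformer you prefer, and the tone_summary key name is illustrative:

```python
import json
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any local embedding model works

def build_chunk_index(jsonl_path="chunk_insights.jsonl", index_path="chunks.faiss"):
    summaries, keys = [], []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            chunk = json.loads(line)
            summaries.append(chunk.get("tone_summary", ""))  # illustrative key name
            keys.append(chunk["chunk"])

    vectors = model.encode(summaries, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vectors.shape[1])  # cosine similarity via inner product
    index.add(np.asarray(vectors, dtype="float32"))
    faiss.write_index(index, index_path)
    return index, keys
```

At query time, embedding a question with the same model and calling index.search pulls back the most relevant chunks to hand to Ollama as context.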
State Management + Notifications
- I implemented a backend state machine that tracks:
Uploading → Processing Data → Complete
- A simple GET /status endpoint exposes the current state
- The frontend polls this and reflects it in real time, both on the Home Page and a dedicated Status Page
No guesswork. The user always knows what’s happening.
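A bare-bones version of the status flow looks roughly like this (sketched with FastAPI here as an example; the stage names simply mirror the animation steps):

```python
from enum import Enum
from fastapi import FastAPI

class Stage(str, Enum):
    IDLE = "idle"
    UPLOADING = "uploading"
    PROCESSING_DATA = "processing_data"
    COMPLETE = "complete"

app = FastAPI()
current_stage = Stage.IDLE  # single in-process state for the active upload

def set_stage(stage: Stage):
    """Called by the upload and processing pipeline as it advances."""
    global current_stage
    current_stage = stage

@app.get("/status")
def get_status():
    # The frontend polls this and maps each stage to the animation step.
    return {"stage": current_stage.value}
```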
Frontend Integration (No UI Tweaks)
I built a basic React frontend with these pages:
- Home Page – Upload button, animated status
- Chats Page – Lists uploaded chats with metadata
- Settings Page – Lets me configure Ollama as the AI engine
- Status Page – Shows real-time updates
- How To Use Page – Fully documented, no placeholders
- About Page – Brief overview + GitHub link
Everything talks to the backend, cleanly and in sync.
What Comes Next?
With ingestion, parsing, enrichment, and storage in place, I now have a powerful base layer to build on. Next features on my radar:
- Chat analytics: Heatmaps, ghost-gap graphs, keyword trends
- Storybook mode: Interactive chapter-based timeline
- Generative tools:
- Short stories and zines
- Meme captions
- Chatbot simulation in my tone
- Panic archives for motivational posters
Because I’ve stored everything properly—once—I can build all of these without reprocessing a thing.
Why I’m Building It This Way
Most people analyze chats in dashboards.
I want to remember, laugh, and turn those messages into art.
I didn’t just want metrics. I wanted a timeline of shared life, the weird stuff, the quiet patches, the comeback phases, the passive-aggressive emojis.
All offline. All mine.
If you're building something similar—or want to try this out—I’d love feedback or contributions. I’m documenting every piece along the way.