Building Chat History Explorer: From Late-Night Idea to Local AI-Powered WhatsApp Analyzer

Kamallraj

April 2025 – This project began the same way most of my side projects do: with a .zip file, a bit of curiosity, and an urge to see what would happen if I could analyze my own WhatsApp chats.

At first, the idea was simple—count some messages, maybe chart when conversations peaked. But the more I thought about it, the more I realized: this could be so much more.

What if I could generate stories out of chats?
What if I could visualize ghost gaps, or recreate my chaotic message threads as memes?
What if I could build a memory timeline, all running entirely offline?

So, I built Chat History Explorer—a privacy-first, locally hosted app that turns exported WhatsApp chats into analytics, timelines, and even creative outputs like zines and AI-written stories.


The Starting Point: What Did I Want?

I wasn’t aiming for another sterile dashboard.

I wanted something that was:

  • Local & Secure – All data stays on my machine.
  • Flexible – Runs local LLMs like Ollama for all AI tasks.
  • Playful – From Shakespearean rewrites to timeline-based storybook chapters.
  • Personal – Less enterprise SaaS, more scrapbook with vibes.

I sketched the vision, wrote down the wildest features I could think of, and then got to work building the foundation.


The Core I Built First

1. One-Click Upload + Animated Feedback

  • I built a Home Page with a big “Upload WhatsApp Chat” button.
  • It accepts a .zip file with:
    • One chat.txt file (the exported WhatsApp text log).
    • Media files in a flat folder.
  • As soon as I uploaded a file, a little animation kicked in:
    Uploading → Processing Data → Complete
    
  • When it was done, a clear message appeared:
    "Your data is ready for analysis!"

2. Parsing and Enrichment

The parser had to be smart. WhatsApp export formats can vary—some use 12-hour clocks, others 24. Some are DD/MM/YYYY, others MM/DD/YYYY. And multiline messages? Always messy.

For each message, I extracted:

  • timestamp, sender, text
  • has_media (based on <Media omitted> or filename match)
  • emoji_count, word_count, char_count
  • hour, weekday, and a daypart label (morning, afternoon, evening, night)
  • A flag if it marked a gap (>2 days between messages)
  • Another if it looked like a reply

Everything was stored in messages.db, powered by SQLite.
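A minimal sketch of that parser, assuming only the two timestamp formats mentioned above (24-hour DD/MM/YYYY and 12-hour MM/DD/YY); real exports vary more, and the header regex, format list, and field names here are simplifications, not the app's actual code:

```python
import re
from datetime import datetime

# Header of a new message, e.g. "12/04/2025, 21:30 - Alice: hey"
# or "4/12/25, 9:30 PM - Bob: hi". Non-matching lines continue the
# previous message (WhatsApp allows multiline bodies).
HEADER = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{2,4}), "
    r"(?P<time>\d{1,2}:\d{2}(?:\s?[AP]M)?) - "
    r"(?P<sender>[^:]+): (?P<text>.*)$"
)

# Only the two formats named in the text; a production parser
# would append further candidates here.
DATE_FORMATS = ["%d/%m/%Y %H:%M", "%m/%d/%y %I:%M %p"]

def parse_timestamp(date: str, time: str):
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(f"{date} {time}", fmt)
        except ValueError:
            continue
    return None  # unknown format; caller can log and skip

def parse_chat(lines):
    """Yield one dict per message, merging multiline bodies."""
    current = None
    for line in lines:
        m = HEADER.match(line)
        if m:
            if current:
                yield current
            current = {
                "timestamp": parse_timestamp(m["date"], m["time"]),
                "sender": m["sender"],
                "text": m["text"],
                "has_media": "<Media omitted>" in m["text"],
            }
        elif current:
            # No header: continuation of the previous message.
            current["text"] += "\n" + line
    if current:
        yield current
```

The enrichment fields (emoji_count, daypart, gap and reply flags) are then cheap derivations from the timestamp and text before the row is inserted into SQLite.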

3. Chunking + Insight Files

Next, I divided the chats into time-based chunks—monthly by default, weekly if the volume was high.

For each chunk, I computed:

  • Total messages
  • Average message length
  • Emoji frequency
  • Nighttime message ratio (10 PM – 3 AM)
  • Sentiment score (via VADER/TextBlob)
  • A tone summary using Ollama (locally)

I also assigned each chunk a phase label, like:

"Hi." → “The Emoji Era” → “Ghost Gap #1” → “Memes That Saved Us”

All of that went into a chunk_insights.jsonl file.
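A sketch of the per-chunk stats, with hypothetical field names; the sentiment score (VADER/TextBlob) and tone summary (Ollama) are left out because both need external models, and the emoji and night-window heuristics here are my simplifications:

```python
from collections import Counter
from datetime import datetime

# "Night" read as 10 PM up to (but not including) 3 AM; hour 3
# itself could arguably be included.
NIGHT_HOURS = {22, 23, 0, 1, 2}

def chunk_insights(messages):
    """Stats for one time chunk of parsed messages.

    The sentiment score (VADER/TextBlob) and tone summary (Ollama)
    would be computed here too; omitted in this sketch."""
    total = len(messages)
    if total == 0:
        return None
    avg_len = sum(len(m["text"]) for m in messages) / total
    night = sum(1 for m in messages if m["timestamp"].hour in NIGHT_HOURS)
    emojis = Counter()
    for m in messages:
        # Crude emoji check: most emoji code points sit above U+1F000.
        emojis.update(ch for ch in m["text"] if ord(ch) > 0x1F000)
    return {
        "total_messages": total,
        "avg_message_length": round(avg_len, 1),
        "night_ratio": round(night / total, 2),
        "top_emojis": emojis.most_common(3),
    }
```

One such dict per chunk, serialized one-per-line, is exactly the JSONL shape described above.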

4. Global Metadata

At the full-chat level, I calculated:

  • Total messages/media
  • Start/end dates
  • Longest silence
  • Most active day/hour
  • Emoji usage ratio
  • First and last messages
  • Participants

Stored in global_metadata.json, this sets the stage for overview cards, filters, and story mode navigation later.
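Most of these roll-ups are a single SQL aggregate each once the messages sit in SQLite. A sketch against an in-memory stand-in for messages.db (the table layout here is an assumption), showing two of the trickier ones:

```python
import sqlite3

# In-memory stand-in for messages.db with a simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (ts TEXT, sender TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [("2025-01-01 09:15:00", "Alice", "hi"),
     ("2025-01-01 09:20:00", "Bob", "hey"),
     ("2025-01-05 22:40:00", "Alice", "back!")],
)

# Most active hour of day: group by the hour component, take the top row.
busiest = conn.execute(
    "SELECT strftime('%H', ts) AS hour, COUNT(*) AS n "
    "FROM messages GROUP BY hour ORDER BY n DESC LIMIT 1"
).fetchone()

# Longest silence in days: the biggest gap between consecutive
# timestamps, via the LAG window function (SQLite >= 3.25).
longest_gap = conn.execute(
    "SELECT MAX(gap) FROM ("
    "  SELECT julianday(ts) - julianday(LAG(ts) OVER (ORDER BY ts)) AS gap"
    "  FROM messages)"
).fetchone()[0]
```

Start/end dates, totals, and participants fall out of even simpler `MIN`/`MAX`/`COUNT`/`DISTINCT` queries over the same table.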


Adding Embeddings with FAISS

Since some chats can hit 100K+ messages, I needed a way to retrieve high-context segments efficiently.

So I:

  • Embedded each chunk summary using a local model
  • Stored it in a local FAISS index

This enabled smarter context-aware insights within the app—all without touching the internet.


State Management + Notifications

  • I implemented a backend state machine that tracks:
    Uploading → Processing Data → Complete
    
  • A simple GET /status endpoint exposes the current state
  • The frontend polls this and reflects it in real time, both on the Home Page and a dedicated Status Page

No guesswork. The user always knows what’s happening.
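The state machine boils down to a small transition table; the `Pipeline` class and transition set below are illustrative, not the app's actual code, and in the real backend `status()` would back the GET /status response:

```python
from enum import Enum

class Status(Enum):
    IDLE = "idle"
    UPLOADING = "uploading"
    PROCESSING = "processing_data"
    COMPLETE = "complete"

# Legal transitions; anything else is rejected.
TRANSITIONS = {
    Status.IDLE: {Status.UPLOADING},
    Status.UPLOADING: {Status.PROCESSING},
    Status.PROCESSING: {Status.COMPLETE},
    Status.COMPLETE: {Status.UPLOADING},  # a new upload restarts the cycle
}

class Pipeline:
    def __init__(self):
        self.state = Status.IDLE

    def advance(self, new: Status):
        if new not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new}")
        self.state = new

    def status(self) -> dict:
        """Payload a GET /status endpoint could return for polling."""
        return {"state": self.state.value}
```

Rejecting illegal transitions is what keeps the polled status trustworthy: the frontend can never observe "complete" unless processing actually finished.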


Frontend Integration (No UI Tweaks)

I built a basic React frontend with these pages:

  • Home Page – Upload button, animated status
  • Chats Page – Lists uploaded chats with metadata
  • Settings Page – Lets me configure Ollama as the AI engine
  • Status Page – Shows real-time updates
  • How To Use Page – Fully documented, no placeholders
  • About Page – Brief overview + GitHub link

Everything talks to the backend, cleanly and in sync.


What Comes Next?

With ingestion, parsing, enrichment, and storage in place, I now have a powerful base layer to build on. Next features on my radar:

  • Chat analytics: Heatmaps, ghost-gap graphs, keyword trends
  • Storybook mode: Interactive chapter-based timeline
  • Generative tools:
    • Short stories and zines
    • Meme captions
    • Chatbot simulation in my tone
    • Panic archives for motivational posters

Because I’ve stored everything properly—once—I can build all of these without reprocessing a thing.


Why I’m Building It This Way

Most people analyze chats in dashboards.
I want to remember, laugh, and turn those messages into art.

I didn’t just want metrics. I wanted a timeline of shared life, the weird stuff, the quiet patches, the comeback phases, the passive-aggressive emojis.

All offline. All mine.

If you're building something similar—or want to try this out—I’d love feedback or contributions. I’m documenting every piece along the way.
