Speech Query Proof of Concept

John Walley

Background

I wanted to give PandaAI a try.

PandaAI transforms your natural language questions into actionable insights — fast, smartly, and effortlessly.

But what data should I use? How about Texas Lottery Scratchoff data!

Why do you have that?

There was an article in the WSJ last month about Rook TX (and Black Swan Capital) winning lotteries when the odds became favorable. The Journal article is paywalled, but you can find info about them elsewhere. Anyway, the Texas Lottery updates the scratch ticket claim data daily, and they’re kind enough to publish it as a CSV. Long story short, I started loading this data into a Neo4J database every day, and deployed a Streamlit app to query it.

https://texas-scratchoff-dashboard.streamlit.app

Unfortunately, the claim data doesn’t have the level of detail I had originally hoped for (like the store that sold the ticket), so using a graph database doesn’t really buy me anything here. But I had my heart set on using Neo4J, and it’s a thing now.

Getting Started

Database

Neo4J offers free AuraDB accounts, and I have a very basic db with 2 node types and 1 relationship.

call apoc.meta.graph()

example Game node

MATCH (g:Game) WHERE g.game_number = "2413" RETURN g
(: Game {
    game_name: "Ca$h Blowout",
    prizes_claimed: 3937634,
    date_updated: "05/05/2025",
    game_close_date: "",
    ticket_price: 10.0,
    game_number: "2413",
    total_prizes: 7302180
})

example Detail node

MATCH (d:Detail) WHERE d.game_number = "2413" AND d.prize_level = "50000" RETURN d
(: Detail {
    prize_level: "50000",
    prizes_claimed: 3,
    game_number: "2413",
    total_prizes: 10
})
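
As a quick illustration (not taken from the repo), here is a minimal Python sketch of what those properties allow. It copies the values from the two example nodes into plain dicts and works out how many prizes are still unclaimed. The helper names `remaining` and `claimed_fraction` are made up for this sketch. Note that `game_number` and `prize_level` are stored as strings, so they need a cast before any numeric comparison.

```python
# Property values copied from the example Game and Detail nodes above.
game = {
    "game_number": "2413",
    "game_name": "Ca$h Blowout",
    "ticket_price": 10.0,
    "prizes_claimed": 3937634,
    "total_prizes": 7302180,
}
detail = {
    "game_number": "2413",
    "prize_level": "50000",
    "prizes_claimed": 3,
    "total_prizes": 10,
}


def remaining(node):
    """Prizes not yet claimed for a Game or Detail node."""
    return node["total_prizes"] - node["prizes_claimed"]


def claimed_fraction(node):
    """Share of prizes already claimed, or 0.0 if the node lists no prizes."""
    total = node["total_prizes"]
    return node["prizes_claimed"] / total if total else 0.0


print(remaining(game))    # 3364546
print(remaining(detail))  # 7

# prize_level is a string in the database, so cast it before comparing.
is_big_prize = int(detail["prize_level"]) >= 10_000
```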

App

Basic outline

Use Python and the neo4j package to query Neo4J

Load data into a pandas dataframe

Convert pandas df to PandasAI semantic dataframe

User visits Gradio UI

Use Hugging Face transformers to perform speech recognition

Submit transcribed speech to PandasAI chat

Display answer in Gradio
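
The first three steps of that outline could be sketched roughly as follows. This is a sketch under assumptions, not the code from the repo: the Cypher query is a guess built from the property names on the example Game node, `PAI_API_KEY` is a made-up environment variable name, and the PandasAI calls (`pai.api_key.set`, `pai.DataFrame`, `.chat`) reflect my reading of the v3 beta API, so check them against the version you install. The third-party imports sit inside the functions so the module loads even when a package is missing.

```python
import os

# Property names come from the example Game node; the query itself is a guess.
GAME_QUERY = """
MATCH (g:Game)
RETURN g.game_number AS game_number,
       g.game_name AS game_name,
       g.ticket_price AS ticket_price,
       g.prizes_claimed AS prizes_claimed,
       g.total_prizes AS total_prizes
"""


def fetch_rows(uri, user, password, query=GAME_QUERY):
    """Run a Cypher query and return each record as a plain dict."""
    from neo4j import GraphDatabase  # pip install neo4j

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            return [record.data() for record in session.run(query)]


def ask(rows, question):
    """Wrap the rows in a PandasAI dataframe and ask a plain-English question."""
    import pandas as pd
    import pandasai as pai  # pip install "pandasai>=3.0.0b2"

    # PAI_API_KEY is a hypothetical variable name used only in this sketch.
    pai.api_key.set(os.environ["PAI_API_KEY"])
    # Assumption: pai.DataFrame accepts an existing pandas DataFrame.
    semantic_df = pai.DataFrame(pd.DataFrame(rows))
    return semantic_df.chat(question)
```

Usage would be something like `rows = fetch_rows(uri, user, password)` followed by `ask(rows, "Which five games have the highest ticket price?")`, with the connection details coming from your AuraDB instance.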

Gradio

Borrow from their [Real Time Speech Recognition](https://www.gradio.app/guides/real-time-speech-recognition) example.

PandasAI? PandaAI?

Is it panda or pandas??? The repo is pandas-ai, you pip install "pandasai>=3.0.0b2", but it also includes

and the web site is getpanda.ai. Oh well, moving on. I’m going to try one of their free API keys (but you can optionally configure a different LLM).

Did it work? Kind of!

My bad, the answer cuts out pretty quickly at the end. Here’s a still

Is that the right answer? Yes. Panda(s)AI returns different result types depending on the question. I guess it’s a dataframe above, but it could be a string, chart, or number too. I spent a couple of minutes trying to handle each type conditionally in Gradio, but decided it was good enough for this.
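
For what it's worth, one way to handle the different result types might look like the sketch below. It is not what the repo does. It assumes the response object exposes its payload on a `.value` attribute (my assumption about the PandasAI response classes) and falls back to the object itself otherwise. Dataframes are detected by duck typing on `to_string`, which pandas DataFrames provide, so the sketch does not import pandas. The function name `to_display_text` is made up, and charts are not handled specially.

```python
from numbers import Number


def to_display_text(answer):
    """Turn a PandasAI-style answer into text a Gradio Textbox can show."""
    # Assumption: response objects wrap their payload in `.value`.
    payload = getattr(answer, "value", answer)

    if isinstance(payload, str):
        return payload
    if isinstance(payload, Number):
        return str(payload)
    # pandas DataFrames expose to_string(); render them without the index column.
    if hasattr(payload, "to_string"):
        return payload.to_string(index=False)
    # Anything else falls back to its default string form.
    return str(payload)
```

A Gradio callback could then return `to_display_text(semantic_df.chat(question))` to a single text output, whatever type comes back.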

Querying Neo4J directly to double-check the answer:

MATCH (g:Game)
WHERE g.ticket_price IS NOT NULL
RETURN g.game_name, g.ticket_price
ORDER BY g.ticket_price DESC
LIMIT 5
g.game_name,g.ticket_price
"$5,000,000 Fortune",100.0
$20 Million Supreme,100.0
Loteria Supreme,100.0
$400 Million Mega Bucks,100.0
500X Loteria Spectacular,50.0

It would need a lot of work to be useful, but it’s kind of cool.

The repo is here in case you want to see the code:

https://github.com/ThatOrJohn/lottery-texas-scratchoff-voice
