Speech Query Proof of Concept

Background
I wanted to give PandaAI a try.
According to its website, PandaAI turns your natural language questions into actionable insights, quickly and with little effort.
But what data should I use? How about Texas Lottery Scratchoff data!
Why do you have that?
There was an article in the WSJ last month about Rook TX (and Black Swan Capital) winning lotteries when the odds became favorable. The Journal article is paywalled, but you can find info about them elsewhere. Anyway, the Texas Lottery updates the scratch ticket claim data daily, and they’re kind enough to publish it as a CSV. Long story short, I started loading this data into a Neo4J database every day, and deployed a Streamlit app to query it.
https://texas-scratchoff-dashboard.streamlit.app
Unfortunately, the claim data doesn’t have the level of detail I had originally hoped for (like the store that sold the ticket), so using a graph database doesn’t really buy me anything here. But I had my heart set on using Neo4J, and it’s a thing now.
Getting Started
Database
Neo4J offers free AuraDB accounts, and I have a very basic database with two node types and one relationship type.
call apoc.meta.graph()
example Game node
MATCH (g:Game) WHERE g.game_number = "2413" RETURN g
(: Game {
game_name: "Ca$h Blowout",
prizes_claimed: 3937634,
date_updated: "05/05/2025",
game_close_date: "",
ticket_price: 10.0,
game_number: "2413",
total_prizes: 7302180
})
example Detail node
MATCH (d:Detail) WHERE d.game_number = "2413" AND d.prize_level = "50000" RETURN d
(: Detail {
prize_level: "50000",
prizes_claimed: 3,
game_number: "2413",
total_prizes: 10
})
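With only these properties, one quick calculation is how many prizes are still unclaimed. Here is a minimal Python sketch using the two example nodes above as plain dicts. The helper name `remaining` is hypothetical and not taken from the app's code.

```python
# Property values copied from the example Game and Detail nodes above.
game = {"game_number": "2413", "prizes_claimed": 3937634, "total_prizes": 7302180}
detail = {
    "game_number": "2413",
    "prize_level": "50000",
    "prizes_claimed": 3,
    "total_prizes": 10,
}


def remaining(node: dict) -> int:
    """Prizes not yet claimed, according to the node's own counters."""
    return node["total_prizes"] - node["prizes_claimed"]


print(remaining(game))    # 3364546
print(remaining(detail))  # 7
```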
App
Basic outline
Use Python and the neo4j package to query Neo4J
Load data into a pandas dataframe
Convert pandas df to PandasAI semantic dataframe
User visits Gradio UI
Use Hugging Face transformers to perform speech recognition
Submit transcribed speech to PandasAI chat
Display answer in Gradio
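The first three steps of the outline could look roughly like the sketch below. It is not the code from the repo: the function names, the environment variable names, and the query are assumptions for illustration. The third-party imports sit inside the functions so the pure helper can be tried without them installed. `GraphDatabase.driver`, `execute_query`, and `Record.data()` come from the neo4j Python driver (5.x). `pai.api_key.set`, `pai.DataFrame`, and `.chat` follow the PandasAI 3.0 beta quickstart, so check them against the version you install.

```python
import os


def fetch_game_records():
    """Run a Cypher query against AuraDB and return one dict per record."""
    from neo4j import GraphDatabase  # pip install neo4j

    # NEO4J_URI / NEO4J_USERNAME / NEO4J_PASSWORD are assumed env var names.
    auth = (os.environ["NEO4J_USERNAME"], os.environ["NEO4J_PASSWORD"])
    with GraphDatabase.driver(os.environ["NEO4J_URI"], auth=auth) as driver:
        records, _, _ = driver.execute_query("MATCH (g:Game) RETURN g")
        return [record.data() for record in records]


def records_to_rows(records, key="g"):
    """Flatten [{'g': {...props}}, ...] into a list of property dicts."""
    return [dict(record[key]) for record in records]


def ask(rows, question):
    """Wrap the rows in a PandasAI dataframe and ask a question."""
    import pandas as pd
    import pandasai as pai  # pip install "pandasai>=3.0.0b2"

    pai.api_key.set(os.environ["PANDASAI_API_KEY"])  # assumed env var name
    return pai.DataFrame(pd.DataFrame(rows)).chat(question)


# The pure step, shown with a trimmed copy of the example Game node.
sample = [{"g": {"game_name": "Ca$h Blowout", "game_number": "2413", "ticket_price": 10.0}}]
rows = records_to_rows(sample)
print(rows[0]["game_name"])  # Ca$h Blowout
```

With real credentials, `ask(records_to_rows(fetch_game_records()), "...")` would chain the three steps together.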
Gradio
I borrowed from Gradio’s [Real Time Speech Recognition](https://www.gradio.app/guides/real-time-speech-recognition) example.
PandasAI? PandaAI?
Is it panda or pandas? The repo is pandas-ai and you pip install "pandasai>=3.0.0b2", but the web site is getpanda.ai. Oh well, moving on. I’m going to try one of their free API keys (you can optionally configure a different LLM instead).
Did it work? Kind of!
My bad, the answer cuts out pretty quickly at the end. Here’s a still
Is that the right answer? Yes. Panda(s)AI returns different result types depending on the question. The one above looks like a dataframe, but it could also be a string, a chart, or a number. I spent a couple of minutes trying to handle each type conditionally in Gradio, but decided it was good enough for this.
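One way to handle the different answer types before handing them to Gradio is to normalize everything to text. The sketch below is not what the app does. It duck-types on the value instead of importing PandasAI's response classes. It assumes that a wrapped response exposes its payload as `.value` and that dataframe-like payloads have a `to_string()` method, as pandas dataframes do. Charts are left out.

```python
from numbers import Number


def to_display_text(response) -> str:
    """Turn a string, number, or dataframe-like answer into text."""
    value = response
    plain = isinstance(response, (str, Number)) or hasattr(response, "to_string")
    if not plain:
        # Assumption: wrapped responses carry the payload in `.value`.
        value = getattr(response, "value", response)
    if isinstance(value, str):
        return value
    if isinstance(value, Number):
        return str(value)
    if hasattr(value, "to_string"):  # e.g. a pandas DataFrame
        return value.to_string()
    return repr(value)


class FakeResponse:
    """Stand-in for a response object, for illustration only."""

    def __init__(self, value):
        self.value = value


print(to_display_text(FakeResponse(100.0)))  # 100.0
print(to_display_text("Loteria Supreme"))    # Loteria Supreme
```

Returning a single string keeps the Gradio output component simple, at the cost of losing table formatting and charts.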
To double-check the answer, I ran the equivalent query directly in Neo4J:
MATCH (g:Game)
WHERE g.ticket_price IS NOT NULL
RETURN g.game_name, g.ticket_price
ORDER BY g.ticket_price DESC
LIMIT 5
g.game_name,g.ticket_price
"$5,000,000 Fortune",100.0
$20 Million Supreme,100.0
Loteria Supreme,100.0
$400 Million Mega Bucks,100.0
500X Loteria Spectacular,50.0
It needs a lot of work to be useful, but it’s kind of cool.
The repo is here, in case you want to see the code:
https://github.com/ThatOrJohn/lottery-texas-scratchoff-voice
Written by John Walley
