One Cloud, Two Clouds: How I Built a Pollution Dashboard (and Almost Broke Everything)

aka how I tried 14 different things, broke everything, fixed everything, and somehow made it look clean by the time of the expo
Let’s start with a very real moment...
It was past midnight, I had QuickSight graphs that looked like straight lines, Athena was yelling at me about “HIVE_BAD_DATA,” and somewhere deep inside, I was still thinking:
“Wait—this is kinda fun”
This blog isn’t just about what I built — it’s about how. And more importantly, how many times I failed while trying to make it work
The Idea
We wanted to build something real — something we could imagine running in an actual smart city. So the project idea was:
Can we simulate pollution sensor data, process it entirely through the cloud, and visualize meaningful trends that could help monitor or even reduce pollution?
Simple idea. Complicated journey
Phase 1: Simulate It Till You Make It (Azure IoT + Blob Storage)
We started on Azure. Why? Because real sensors are expensive and I like simulating HUGE data
Created a container in Azure Blob Storage
Then, built a simulated IoT setup using Azure IoT Hub
Pushed random but somewhat realistic values: AQI, PM2.5, Temp, Humidity
This part was... surprisingly smooth? But obviously, I knew something would break soon
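For anyone who wants to try the simulation step, here is a minimal sketch of a reading generator in plain Python. It is not the actual script from the project: the value ranges, the field names, and the toy AQI formula are all assumptions made for illustration, and sending the readings to IoT Hub (which needs the Azure device SDK) is left out.

```python
import random
from datetime import datetime, timedelta, timezone


def simulate_reading(ts, rng):
    """Return one fake sensor reading as a flat dict (all ranges are assumed)."""
    pm25 = round(rng.uniform(10.0, 150.0), 1)
    # Toy relationship between PM2.5 and AQI, NOT the official AQI formula.
    aqi = max(0, int(pm25 * 2 + rng.uniform(-10, 10)))
    return {
        "timestamp": ts.isoformat(),
        "aqi": aqi,
        "pm25": pm25,
        "temp": round(rng.uniform(15.0, 40.0), 1),
        "humidity": round(rng.uniform(30.0, 90.0), 1),
    }


def simulate_readings(start, count, step_minutes=5, seed=42):
    """Generate `count` readings spaced `step_minutes` apart, reproducibly."""
    rng = random.Random(seed)
    return [
        simulate_reading(start + timedelta(minutes=i * step_minutes), rng)
        for i in range(count)
    ]


# Arbitrary start date, chosen only for the example.
readings = simulate_readings(datetime(2024, 1, 1, tzinfo=timezone.utc), count=12)
```

Seeding the generator keeps reruns identical, which makes it much easier to tell whether a later pipeline failure came from the data or from the pipeline.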
Phase 2: Multi-Cloud Drama Begins (Azure → AWS S3)
Okay, now we’re transferring the simulated data to AWS S3.
Set up the bucket, moved data in .csv format
Thought, “Cool. Time to process it.”
But the first problem hit hard:
Glue didn’t want to recognize my file.
Glue: “I can’t crawl this.”
Me: “But you said you were serverless and smart??”
Turns out, I had to carefully manage the CSV structure, add headers, and keep formats consistent. Also — permissions, oh god the permissions. Glue wouldn’t write, Athena wouldn’t read. S3 was like “403 LOL.”
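The "headers plus consistent formats" lesson can be sketched roughly like this. The column names and the one-decimal formatting are assumptions for the example, not the project's real schema; the point is only that every row goes through the same writer with the same header.

```python
import csv
import io

# Hypothetical column order; the real file may have used different names.
FIELDNAMES = ["timestamp", "aqi", "pm25", "temp", "humidity"]


def rows_to_csv(rows):
    """Serialize readings to CSV text with a header row and uniform formatting."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "timestamp": row["timestamp"],
            "aqi": int(row["aqi"]),                      # always an integer
            "pm25": f"{float(row['pm25']):.1f}",         # always one decimal
            "temp": f"{float(row['temp']):.1f}",
            "humidity": f"{float(row['humidity']):.1f}",
        })
    return buf.getvalue()


csv_text = rows_to_csv([
    {"timestamp": "2024-01-01T00:00:00+00:00", "aqi": "87",
     "pm25": 41, "temp": 22.5, "humidity": 60},
])
```

Casting inside the writer means a stray string or integer in the input fails loudly here instead of confusing the crawler later.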
Phase 3: Glue, Glue, Everywhere (ETL stage)
Once the crawler worked (after trial #37), I wrote a Glue job to:
Convert CSV to Parquet
Clean the field names
Standardize timestamps
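As a rough illustration of the last two steps, here is what the cleaning logic might look like in plain Python. This is not the Glue job itself (a real job would run on Glue's Spark runtime), and the list of accepted input formats is an assumption for the example.

```python
import re
from datetime import datetime

# Hypothetical set of layouts the raw data might arrive in.
INPUT_FORMATS = ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M")


def clean_field_name(name):
    """Lowercase a header and replace runs of non-alphanumerics with '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_")
    return cleaned.lower()


def standardize_timestamp(value):
    """Parse any known input layout and return 'YYYY-MM-DD HH:MM:SS'."""
    for fmt in INPUT_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next layout
        return parsed.strftime("%Y-%m-%d %H:%M:%S")
    raise ValueError(f"unrecognized timestamp: {value!r}")
```

For example, `clean_field_name("PM2.5 (ug/m3)")` gives `pm2_5_ug_m3`, and `standardize_timestamp("24/10/2022 18:30")` gives `2022-10-24 18:30:00`.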
Fun fact: Athena is VERY picky about column types
Typed aqi as a string by mistake? Congrats, query fails forever.
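One way to catch this before Athena does is a quick local check for values that will not cast to the intended type. A minimal sketch, assuming rows are already loaded as dicts (the helper name `find_bad_values` is made up for this example):

```python
def find_bad_values(rows, column, cast=float):
    """Return (row_index, raw_value) pairs for values that `cast` rejects."""
    bad = []
    for i, row in enumerate(rows):
        raw = row.get(column)  # missing column shows up as None
        try:
            cast(raw)
        except (TypeError, ValueError):
            bad.append((i, raw))
    return bad


sample = [{"aqi": "120"}, {"aqi": "N/A"}, {"aqi": ""}, {}]
problems = find_bad_values(sample, "aqi")
```

Here `problems` lists rows 1, 2, and 3, which are exactly the kind of values that tend to surface later as type errors at query time.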
Phase 4: Athena, You're Supposed to Be Cool
Now came the queries:
SELECT date, aqi FROM iot_analysis WHERE aqi > 100;
Athena: “HIVE_BAD_DATA”
Me: “No YOU’RE bad data”
Fixed the schema. Re-partitioned. Cleaned column types in Glue
Eventually... it ran. I cheered. Alone. At 2 AM TT
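The post does not say how the data was re-partitioned, so treat this as one plausible layout rather than what was actually done: Hive-style `key=value` folders by date, which Glue crawlers and Athena can pick up as partition columns. The `iot_analysis` prefix is borrowed from the table name in the query; everything else is an assumption.

```python
from datetime import datetime


def partition_prefix(ts, base="iot_analysis"):
    """Build a Hive-style S3 key prefix (year=/month=/day=) for a timestamp."""
    return f"{base}/year={ts.year:04d}/month={ts.month:02d}/day={ts.day:02d}/"


prefix = partition_prefix(datetime(2022, 10, 24, 18, 30))
# prefix == "iot_analysis/year=2022/month=10/day=24/"
```

Files written under such prefixes let a query that filters on date skip whole folders instead of scanning everything.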
Phase 5: Let’s Make It Pretty (Frontend + QuickSight)
We built a website — clean, lightweight, showing:
Live AQI from an API
Historical data (Plan A: fetched from Athena, Plan B: simulated)
Then came QuickSight. At first, it showed a line chart where all points looked... flat
Realized: our simulated data was too random and short (3 days). So I regenerated 1 month of pollution data, with a massive spike on Diwali (Oct 24–26) - and the charts came alive
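Here is a rough sketch of how a month of hourly data with a festival spike could be generated. The year (2022), the baseline level, the daily cycle shape, and the spike size are all assumptions picked for illustration; only the Oct 24–26 window comes from the project.

```python
import math
import random
from datetime import datetime, timedelta

SPIKE_START = datetime(2022, 10, 24)  # assumed year
SPIKE_END = datetime(2022, 10, 27)    # exclusive, so it covers Oct 24-26


def simulate_month(start=datetime(2022, 10, 1), days=31, seed=7):
    """Hourly AQI for `days` days: baseline + daily cycle + noise + spike."""
    rng = random.Random(seed)
    rows = []
    for hour in range(days * 24):
        ts = start + timedelta(hours=hour)
        # Smooth daily cycle peaking around midday (assumed shape).
        daily = 20 * math.sin(2 * math.pi * (ts.hour - 6) / 24)
        aqi = 110 + daily + rng.gauss(0, 8)
        if SPIKE_START <= ts < SPIKE_END:
            aqi += 180  # assumed size of the festival spike
        rows.append({
            "timestamp": ts.strftime("%Y-%m-%d %H:%M:%S"),
            "aqi": max(0, round(aqi)),
        })
    return rows


month = simulate_month()
```

A structured signal (cycle plus event) on top of small noise is what gives a line chart something to show; pure uniform noise averages out to a flat line.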
Final Dashboard Highlights
Heatmaps showing hourly pollution trends
Multi-line charts for temp, AQI, humidity
Scatter plots showing PM2.5’s relationship with AQI
And bar charts that made me go, “Okay, this looks like real analysis now”
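QuickSight handles the aggregation for these visuals itself, but the idea behind an hourly heatmap is easy to sketch: bucket readings by (weekday, hour) and average each cell. This is only an illustration of that aggregation, not how the dashboard was configured, and it assumes timestamps in `YYYY-MM-DD HH:MM:SS` form.

```python
from collections import defaultdict
from datetime import datetime


def weekday_hour_means(rows, value_key="aqi"):
    """Average `value_key` per (weekday, hour) cell; Monday is weekday 0."""
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for row in rows:
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        cell = sums[(ts.weekday(), ts.hour)]
        cell[0] += float(row[value_key])
        cell[1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


cells = weekday_hour_means([
    {"timestamp": "2022-10-24 18:00:00", "aqi": 300},  # a Monday
    {"timestamp": "2022-10-31 18:30:00", "aqi": 100},  # the next Monday
    {"timestamp": "2022-10-25 09:00:00", "aqi": 80},   # a Tuesday
])
```

The two Monday 18:00 readings land in the same cell and average to 200, while the Tuesday 09:00 reading gets a cell of its own.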
Expo Day
All of this came together just in time
I was running on 3 hours of sleep, nerves, and excitement.
And when the panel asked, “So how is this data from the cloud?”
I confidently walked them through every twist and turn
And yeah - it worked.
The Takeaways
Multi-cloud = cool but messy. Learn your IAM roles
Simulate like a mad scientist, but structure like a backend dev
Nothing will work on first try. That’s normal
Keep going. Break it. Fix it. Ship it.
What’s Next?
Adding real-time streaming
Building an ML model to predict AQI trends
Maybe connecting it to alert systems or a mobile dashboard
P.S.
If you're reading this and building something like it — I’ve been there
DM me. Happy to help (or debug Athena errors with you at 2 AM)
Written by me — with a little help from AI to shape the chaos into a proper blog :)
Written by Soham Wagh