Using Fabric OrgApps + Notebooks For Geospatial Data Exploration
Sandeep Pawar
Simon Willison is one of my favorite bloggers. In fact, what I blog and how I blog & test are inspired by him. A couple of weeks ago he wrote a blog post about the Foursquare Places data that has been open-sourced. I was exploring this dataset and ended up creating a few maps. I love OrgApps in Fabric, and I truly believe that as it matures, it will be THE way for analysts & data scientists to provide rich insights + traditional reports to business users. Notebooks can augment Power BI reports to provide insights that are otherwise not possible. I have submitted a session on this topic to FabCon ‘25, let’s see. If it is selected, I hope to show how transformational it is and how businesses can use it.
I won’t go into much detail about the code below, but a few things to note:
I used daft to scan 104M rows from an S3 bucket in a Fabric Python notebook without downloading the entire dataset. Why daft? Because it’s optimized for reading S3 data. If you run the notebook below, you will see minimal memory & CPU consumption. In his blog post above, Simon used DuckDB. I cleaned and transformed the data lazily using daft.
I also used Polars because it has a nice Altair integration.
I used Folium for creating interactive maps and Plotly for time series.
The notebook is embedded in OrgApps for users to explore the data. You can also embed a Power BI report using QuickVisualize for users to explore the data (as long as it is a small dataset).
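To make the lazy-scan idea concrete, here is a minimal sketch rather than the notebook's actual code. It assumes the bucket allows anonymous reads and that the columns `country`, `name`, `latitude`, `longitude` and `locality` exist in the published Places schema. The helper names `places_glob` and `scan_places` are hypothetical.

```python
def places_glob(release_date: str) -> str:
    """Build the parquet glob for one release of the open-sourced Places data."""
    return f"s3://fsq-os-places-us-east-1/release/dt={release_date}/places/*.parquet"


def scan_places(release_date: str = "2024-11-19", country: str = "US"):
    """Return a lazy daft DataFrame; no data is read until it is collected."""
    # Imported inside the function so the path helper still works
    # in environments where daft is not installed.
    import daft
    from daft.io import IOConfig, S3Config

    # Assumption: the bucket is publicly readable without credentials.
    io_config = IOConfig(s3=S3Config(region_name="us-east-1", anonymous=True))
    df = daft.read_parquet(places_glob(release_date), io_config=io_config)

    # Assumption: these column names match the Places schema.
    # Both steps only extend the query plan; nothing is downloaded yet.
    return (
        df.where(daft.col("country") == country)
          .select("name", "latitude", "longitude", "locality")
    )
```

Calling something like `scan_places().limit(5).to_pandas()` should trigger the actual read, and because the filter and column selection are part of the plan, only the needed columns and rows have to be fetched. If you want the result in Polars, `pl.from_arrow(df.to_arrow())` is one way to hand it over.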
Steps:
Just download this notebook, import it into your Fabric workspace and execute it.
To get a list of files at this S3 location:
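## list of files
from pyarrow import fs  # provides S3FileSystem and FileSelector

s3 = fs.S3FileSystem(region='us-east-1')
# glob covering every parquet file in this release
path = "s3://fsq-os-places-us-east-1/release/dt=2024-11-19/places/*.parquet"
# recursively list everything under the release prefix
file_info = s3.get_file_info(fs.FileSelector(
    "fsq-os-places-us-east-1/release/dt=2024-11-19/",
    recursive=True
))
for info in file_info:
    print(info.path)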
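If you want a rough idea of how much data sits behind that listing before scanning it, the same `file_info` list can be summarized. This is a small sketch, not part of the original notebook. The helper name `summarize_parquet` is hypothetical, and it relies only on the `path`, `size` and `is_file` attributes that pyarrow's `FileInfo` objects expose.

```python
def summarize_parquet(file_info):
    """Return (number of parquet files, their total size in bytes)."""
    # Keep only real files ending in .parquet; directories have no size.
    parquet = [i for i in file_info if i.is_file and i.path.endswith(".parquet")]
    total_bytes = sum(i.size for i in parquet)
    return len(parquet), total_bytes
```

For example, `n_files, total = summarize_parquet(file_info)` followed by `print(n_files, round(total / 1e9, 2), "GB")` would show the file count and approximate size of the release.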