Running SQL on your Excel Files From Your Laptop with Dremio
Being able to quickly analyze and gain insights from your data is crucial. Excel is widely used for data storage, but when it comes to complex queries and analytics, SQL is often the preferred tool. Dremio, a data lakehouse platform, bridges this gap by allowing you to run SQL queries directly on your Excel files without extensive setup. In this tutorial, I'll guide you through setting up Dremio in a Docker container on your laptop and running SQL queries on an Excel file.
Step 1: Setting Up Dremio Using Docker
To get started, you need to run Dremio on your machine using Docker. This approach ensures that the setup is isolated and does not interfere with other software on your system. Here’s the Docker command to start the Dremio server:
docker run -p 9047:9047 -p 31010:31010 -p 45678:45678 -p 32010:32010 -e DREMIO_JAVA_SERVER_EXTRA_OPTS=-Dpaths.dist=file:///opt/dremio/data/dist --name try-dremio dremio/dremio-oss
This command sets up the necessary port mappings and environment variables for Dremio to operate correctly.
Step 2: Accessing the Dremio UI
Once Dremio is up and running, access the Dremio UI by navigating to localhost:9047 in your web browser. Here, you will log in to the Dremio interface, which is intuitive and user-friendly.
Step 3: Uploading Your Excel File
Dremio makes it simple to upload and manage your data sources. To upload an Excel file:
Click the "+" button in the top right corner of the Dremio UI. Select and upload your Excel file.
You can obtain a sample Excel file with structured data for this tutorial from The Spreadsheet Guru.
Step 4: Running SQL Queries
With your data uploaded, you can now use SQL to query and analyze your Excel file directly:
Navigate to the dataset in Dremio.
Use the SQL Runner to execute SQL queries on your data. This step is where Dremio shines, offering powerful tools to execute complex SQL queries on Excel data, allowing for deeper insights and analysis.
Conclusion
Dremio offers a practical solution for running SQL queries on Excel files, combining the power of SQL with the ubiquity of Excel. By leveraging Docker, setup and scalability are simplified, making this approach ideal for personal projects and enterprise needs alike. Try it out, and start gaining more valuable insights from your data today!
Here are some more amazing tutorials to show you what you can do with Dremio, SQL and the Data Lakehouse.
- End-to-End Basic Data Engineering Tutorial (Spark, Dremio, Superset)
- From Postgres to Dashboards with Dremio and Apache Iceberg
- From SQLServer to Dashboards with Dremio and Apache Iceberg
- From MongoDB to Dashboards with Dremio and Apache Iceberg
- Intro to Dremio, Nessie, and Apache Iceberg on Your Laptop
- Using Flink with Apache Iceberg and Nessie
- Getting Started with Project Nessie, Apache Iceberg, and Apache Spark Using Docker
- Experience the Dremio Lakehouse: Hands-on with Dremio, Nessie, Iceberg, Data-as-Code and dbt
Subscribe to my newsletter
Read articles from Alex Merced directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Alex Merced
Alex Merced
Alex Merced is a developer advocate at Dremio with experience as a developer and instructor. His professional journey includes roles at GenEd Systems, Crossfield Digital, CampusGuard, and General Assembly. He co-authored "Apache Iceberg: The Definitive Guide" published by O'Reilly and has spoken at notable events such as Data Day Texas and Data Council. Alex is passionate about technology, sharing his expertise through blogs, videos, podcasts like Datanation and Web Dev 101, and contributions to the JavaScript and Python communities with libraries like SencilloDB and CoquitoJS. Find all youtube channels, podcasts, blogs, etc. at AlexMerced.com