A portable Data Analytics stack using Docker, Mage, dbt-core, DuckDB and Superset
Just wanted to share a small learning-by-doing project of mine. It's a containerized Data Analytics suite covering the end-to-end analytics process for a small (imaginary) company.
We're talking about:
- generating example data in parquet files using Python
- ingesting data into DuckDB
- modeling data using dbt-core
- loading a DuckDB datamart
- orchestrating using Mage
- displaying it all in a Superset dashboard.
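To make the ordering of these steps concrete, here is a minimal sketch of the dependency chain an orchestrator has to respect. It is plain Python using the standard library's `graphlib`, not Mage's API, and the step names are hypothetical labels I made up for illustration rather than names taken from the project.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each step maps to the set of steps
# that must finish before it can start.
PIPELINE = {
    "generate_parquet": set(),
    "ingest_into_duckdb": {"generate_parquet"},
    "run_dbt_models": {"ingest_into_duckdb"},
    "load_datamart": {"run_dbt_models"},
    "refresh_dashboard": {"load_datamart"},
}


def execution_order(pipeline):
    """Return the steps in an order that satisfies every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    # Print the steps in the order they would need to run.
    for position, step in enumerate(execution_order(PIPELINE), start=1):
        print(f"{position}. {step}")
```

Since every step here depends on exactly one predecessor, there is only one valid order. A real orchestrator adds scheduling, retries and logging on top of this kind of graph.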
Each component runs in its own Docker container, all tied together with docker-compose.
I've previously set up similar projects with Airflow and Dagster.
It's pretty bare-bones (somewhat as intended) and has some rough edges, but it should be a good starting point for a demo or a template, or for learning how all these components work together.
I would of course appreciate any feedback or suggestions on how to make it better.
Found it useful? Check out my Analytics newsletter at notjustsql.com.
Written by
Constantin Lungu
Senior Data Engineer • Contractor / Freelancer • GCP & AWS Certified