A portable Data Analytics stack using Docker, Mage, dbt-core, DuckDB and Superset

Just wanted to share a small learning-by-doing project of mine. It's a containerized Data Analytics suite, covering the end-to-end analytics process for a small (imaginary) company.

We're talking about:
- generating example data in Parquet files using Python
- ingesting the data into DuckDB
- modeling the data using dbt-core
- loading a DuckDB datamart
- orchestrating the pipeline using MageAI
- displaying it all in a Superset dashboard.
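
To make the first two steps a bit more concrete, here is a minimal sketch of generating fake data, writing it to a Parquet file, and ingesting it into a DuckDB database. It is not the project's actual code: the `orders` schema, the product names, the file names, and the `raw_orders` table are all made up for illustration, and I'm letting DuckDB itself write the Parquet file to keep dependencies down (pandas or pyarrow would work just as well).

```python
import random
import tempfile
from datetime import date, timedelta
from pathlib import Path


def generate_orders(n_rows: int, seed: int = 42) -> list[tuple]:
    """Build fake order rows: (order_id, order_date, product, amount)."""
    rng = random.Random(seed)  # seeded, so repeated runs give the same data
    products = ["widget", "gadget", "gizmo"]  # hypothetical product names
    start = date(2024, 1, 1)
    return [
        (
            order_id,
            start + timedelta(days=rng.randrange(365)),
            rng.choice(products),
            round(rng.uniform(5, 500), 2),
        )
        for order_id in range(1, n_rows + 1)
    ]


def write_parquet(rows: list[tuple], parquet_path: Path) -> None:
    """Stage the rows in an in-memory DuckDB table and export it as Parquet."""
    import duckdb  # imported lazily so the generator works without DuckDB

    con = duckdb.connect()  # no path given: in-memory database
    con.execute(
        "CREATE TABLE staging "
        "(order_id INTEGER, order_date DATE, product VARCHAR, amount DOUBLE)"
    )
    con.executemany("INSERT INTO staging VALUES (?, ?, ?, ?)", rows)
    con.execute(f"COPY staging TO '{parquet_path.as_posix()}' (FORMAT PARQUET)")
    con.close()


def ingest(parquet_path: Path, db_path: Path) -> int:
    """Load the Parquet file into a raw table of a persistent DuckDB file."""
    import duckdb

    con = duckdb.connect(str(db_path))
    con.execute(
        "CREATE OR REPLACE TABLE raw_orders AS "
        f"SELECT * FROM read_parquet('{parquet_path.as_posix()}')"
    )
    row_count = con.execute("SELECT count(*) FROM raw_orders").fetchone()[0]
    con.close()
    return row_count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        parquet_file = Path(tmp) / "orders.parquet"
        write_parquet(generate_orders(1000), parquet_file)
        loaded = ingest(parquet_file, Path(tmp) / "analytics.duckdb")
        print(f"Loaded {loaded} rows into raw_orders")
```

In a setup like this one, dbt-core would then build its models on top of a raw table such as `raw_orders`, and the orchestrator would call these steps in order.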

Each component runs in its own Docker container, all tied together with docker-compose.

I've previously set up similar projects with Airflow and Dagster.

It's pretty bare-bones (somewhat as intended) and has some rough edges, but it should be a good starting point for a demo, a template, or for learning how all these components work together.

I would of course appreciate any feedback or suggestions on how to make it better.

Found it useful? Check out my Analytics newsletter at notjustsql.com.

Written by

Constantin Lungu

Senior Data Engineer • Contractor / Freelancer • GCP & AWS Certified