E-Commerce Analytics: Open-Source BI with Supabase, dbt, Metabase & Airflow

Project Overview

E-commerce businesses rely on data-driven insights to optimize sales, enhance customer experiences, and streamline operations. However, many small and medium-sized enterprises (SMEs) struggle with expensive BI solutions, fragmented data sources, and complex infrastructure management.

This E-Commerce Analytics project provides a fully open-source, cost-effective, and scalable business intelligence solution leveraging Supabase (PostgreSQL), dbt, and Metabase. By deploying this containerized analytics stack, businesses can access real-time insights without vendor lock-in or high licensing costs.

Business Problem

E-commerce companies face the following key challenges:

  1. High BI Tool Costs – Proprietary BI platforms such as Tableau and Looker require costly subscriptions.
  2. Data Fragmentation – Sales, customer, and operational data often reside in multiple disconnected systems.
  3. Technical Complexity – Many SMEs lack the engineering resources to set up robust analytics pipelines.
  4. Scalability Issues – Growing businesses need an analytics stack that can scale with their data needs.

Proposed Solution

The Supabase E-Commerce Analytics project delivers a self-hosted, modular, and scalable data pipeline that:

✔️ Ingests e-commerce data from various sources using Python & SQLAlchemy ✔️ Transforms raw data into structured analytics models with dbt ✔️ Visualizes insights using Metabase dashboards ✔️ Runs seamlessly in Docker containers, enabling fast deployment and easy scalability

Key Benefits

1️⃣ Cost-Effective & Fully Open-Source • No recurring license fees, making it ideal for startups and SMEs • Avoid vendor lock-in and maintain full control over data

2️⃣ Real-Time & Actionable Insights • Prebuilt dashboards for customer behavior, product sales, and operational efficiency • Supports real-time analytics with Supabase’s event-driven architecture

3️⃣ Scalable & Cloud-Agnostic • Deployable on AWS, GCP, or on-premise with Docker & Kubernetes • Designed for growing e-commerce businesses that need analytics at scale

4️⃣ Self-Hosted & Secure • No third-party data exposure, ensuring full compliance with GDPR & CCPA • Customizable row-level security (RLS) policies for controlled access

Use Cases

Sales Performance Optimization • Identify top-selling products and categories • Track customer lifetime value (LTV) and retention trends • Analyze seasonal sales trends to improve forecasting

Customer Segmentation & Personalization • Understand geographic customer distribution and demographics • Optimize marketing strategies by analyzing customer behavior

Operational Efficiency & Cost Reduction • Monitor order fulfillment times & shipping efficiency • Identify supply chain bottlenecks to reduce delays

System Architecture

Tech Stack

ComponentTechnologyWhy?
DatabaseSupabase (PostgreSQL)Managed PostgreSQL with REST APIs & real-time capabilities
ETL & Data IngestionPython (Pandas, SQLAlchemy)Automates data loading into Supabase
Transformationsdbt (Data Build Tool)SQL-based modeling, data transformation, and testing
BI & DashboardsMetabaseOpen-source business intelligence tool
ContainerizationDocker & Docker ComposeSimplifies deployment and orchestration
Orchestration (Optional)Apache Airflow or PrefectAutomates workflow execution if needed later
Version ControlGitHubCode management & CI/CD workflows

Features

  • Containerized Deployment: All services run seamlessly in Docker containers
  • Supabase as Database Backend: PostgreSQL with REST API, authentication, and real-time capabilities
  • dbt Transformations: SQL-based analytics engineering with well-structured models
  • Dimensional Modeling: Proper star schema for business intelligence
  • Interactive Dashboards: Prebuilt Metabase dashboards for actionable insights
  • Fully Open Source: No licensing costs, completely self-hosted with Docker

Dashboards

This project includes two interactive dashboards built with Metabase:

Customer Analytics Dashboard 📊

  • Geographic distribution of customers
  • New customer acquisition trends
  • Customer lifetime value distribution
  • Order value distribution

Customer Analytics Dashboard

Product Analytics Dashboard 🛍️

  • Top-selling products and revenue by category
  • Price point analysis and sales trends over time
  • Product category performance breakdown

Product Analytics Dashboard

Data Models

This project implements a star schema with the following key models:

Staging Layer (Raw Data Preparation)

  • stg_olist__customers.sql - Customer details
  • stg_olist__geolocation.sql - Geographic information
  • stg_olist__order_items.sql - Individual order items
  • stg_olist__order_payments.sql - Payment transactions
  • stg_olist__order_reviews.sql - Customer reviews on orders
  • stg_olist__orders.sql - Order details
  • stg_olist__product_categories.sql - Product categories mapping
  • stg_olist__products.sql - Product details
  • stg_olist__sellers.sql - Seller details

Intermediate Layer (Business Logic Processing)

  • int_customer_orders.sql - Aggregates customer purchase history
  • int_orders_with_items.sql - Combines orders with their respective items
  • int_product_performance.sql - Computes sales performance for products
  • int_seller_performance.sql - Evaluates seller performance

Mart Layer (Analytics-Ready Models)

  • mart_customer_analytics.sql - Customer behavior and retention analysis
  • mart_product_analytics.sql - Product sales and profitability insights
  • mart_seller_analytics.sql - Seller sales and performance tracking

Subcategories

  • Product Analytics:
    • products.sql - Final product-level insights
  • Seller Analytics:
    • sellers.sql - Final seller-level insights

Installation & Setup

1️⃣ Start the Full Stack with Docker

git clone https://github.com/supabase/supabase.git
cd supabase/docker
docker-compose up -d

2️⃣ Configure dbt to Connect to Supabase

Edit ~/.dbt/profiles.yml:

ecommerce_analytics:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      user: postgres
      password: yourpassword
      port: 5432
      dbname: postgres
      schema: public
      threads: 4

Then test the connection:

dbt debug

3️⃣ Set Up Metabase

# Download Metabase
curl -o metabase.jar https://downloads.metabase.com/latest/metabase.jar

# Run Metabase
java -jar metabase.jar

4️⃣ Load Data into Supabase

python src/etl/loader.py

5️⃣ Run dbt Transformations

cd dbt_project
dbt run
dbt test
dbt docs generate
dbt docs serve

6️⃣ Access Metabase Dashboards

Open http://localhost:3000, configure the PostgreSQL connection, and import the dashboards from metabase/dashboards/.

Extending the Project

Adding New Data Sources

  1. Create a new loader module in data_loader/sources/
  2. Define source tables in dbt sources.yml
  3. Create corresponding staging models

Leveraging Supabase Features

  1. Authentication: Use Supabase auth for dashboard access control
  2. REST API: Build custom applications that connect to your analytics
  3. Real-time Updates: Create live dashboards that update in real-time

Implementing Advanced Analytics

  1. Add derived metrics in dbt models
  2. Implement predictive analytics with dbt Python models

Troubleshooting

Common Issues

Supabase Connection Failures

  • Check Docker container status with docker ps
  • Verify database credentials in your configuration
  • Review logs with docker logs supabase-db

dbt Transformation Errors

  • Run dbt debug to verify configuration
  • Check for SQL syntax errors in models
  • Ensure dependencies are correctly specified

Metabase Connection Issues

  • Verify Supabase credentials in Metabase admin
  • Check network connectivity to Supabase
  • Ensure proper permissions for the Metabase database user

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments


🔗 Live Project Repository: GitHub

🌐 Portfolio Entry: Now featured on www.drjodyannjones.com 🚀

0
Subscribe to my newsletter

Read articles from Dr. Jody-Ann S. Jones directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Dr. Jody-Ann S. Jones
Dr. Jody-Ann S. Jones

🏆 An accomplished Data Science leader, I bring several years of progressive experience, combining sharp business acumen with a passion for delivering impactful data-driven insights. I've successfully steered numerous projects, deploying robust machine learning models and advanced data analysis to drive strategic decision-making and resolve complex business challenges. 📊 My strength lies in my ability to unravel the intricate story hidden within data, transforming abstract patterns into tangible recommendations that propel organizational strategies forward. I excel at communicating these insights, ensuring a clear understanding across diverse business audiences, and aligning technology with end-to-end project processes and business plans. 👏 Beyond my individual contributions, I take pride in fostering the growth of others. I've cultivated high-performing Data Analyst teams, nurturing their development through effective leadership and training.