Py-DockerDB: Simplifying Programmatic Database Handling

If you’ve ever worked on a backend or data-centric project in a small team, chances are you’ve hit the same wall I did: setting up local databases consistently, reliably, and without friction.

It’s a deceptively simple task. And yet, it’s where many projects start to feel fragile.

The Problem: The Local Database Setup Spiral

Picture this: you're building a microservice architecture that talks to a PostgreSQL database. You’re working with two other developers. You write a quick README:

“Make sure you have Postgres installed. Create a user, a password, a database. Import this SQL script. Use port 5432 unless it’s taken.”

You think it’s fine. Then the pull requests start rolling in with bugs that don’t make sense. Someone’s DB is misconfigured. Someone else forgot to run the schema script. Another person installed the wrong version of Postgres on Windows and it won’t even start.

And when you try to onboard a new teammate? If it's been a while since anyone performed the setup, it can quickly turn into a full afternoon of troubleshooting.

Local database setup is deceptively expensive. It introduces variance into your dev environments and bakes hidden assumptions into your codebase.

Even with Docker, it's rarely elegant. You might end up with a mess of docker-compose files, environment variables, half-broken shell scripts, and manual volume mounts that no one dares touch.

A Real Example: Automating WG-Gesucht Notifications

In my case, this hit home during development of a bot that scrapes listings from WG-Gesucht (a German apartment-sharing site) and automatically alerts users based on their preferences. It’s built as a collection of microservices:

  • A scraper service (pulls data and stores in Postgres),

  • A vector search module using pgvector (to recommend listings),

  • A notification dispatcher (integrates with email/Telegram).

Each service uses a local DB during development and testing. I needed to:

  • Quickly spin up Postgres and MySQL instances with test data,

  • Run init scripts and seed content from Python,

  • Share configurations with collaborators using Jupyter notebooks,

  • Avoid any OS-specific setup pain.

Docker is an obvious choice—but I didn’t want my dev flow to depend on Docker CLI commands buried in scripts. I wanted everything runnable in Python, so it could live side-by-side with my logic and be testable, restartable, and explicit.

What the Right Solution Should Look Like

At this point, I had some clear goals in mind for a better approach to local databases:

  • Python-first interface: no shell scripts, no docker-compose.yml, no Makefiles.

  • Minimal dependencies: no installing client tools or external setup scripts.

  • Cross-platform: works on macOS, Linux, and Windows (even WSL).

  • Supports init scripts and volumes: I want to seed data or persist it across runs.

  • Clear lifecycle control: I want to call .create_db(), .stop_db(), and .delete_db() on a database like any other Python object.

This is the tool I wish existed from the start. So I built it.

Introducing py-dockerdb

py-dockerdb is a Python library that lets you manage real Dockerized databases like native Python objects. Visit the project on GitHub for more info and usage examples!

With just a few lines of code, you can spin up Postgres, MongoDB, MySQL, or SQL Server, inject init scripts, connect with familiar Python drivers—and tear them down cleanly when done.

No shell, no YAML, no guesswork.

from docker_db.postgres_db import PostgresConfig, PostgresDB

config = PostgresConfig(
    user="botuser",
    password="botpass",
    database="wggesucht_db",
    container_name="local-postgres"
)

db = PostgresDB(config)
db.create_db()

conn = db.connection
cursor = conn.cursor()
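# Assumes a listings table already exists, e.g. seeded via an init script (see below)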
cursor.execute("SELECT COUNT(*) FROM listings;")
print(cursor.fetchone())

This is a real PostgreSQL instance running in Docker, ready for interaction and controlled completely from Python. Point the config at an init script and it gets seeded on first boot, too.
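
And when you're done, the same object handles teardown. The full lifecycle API is covered in the next section, but the shape of it looks like this:

db.stop_db()     # stop the container
db.restart_db()  # bring it back up
db.delete_db()   # remove it entirely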

How It Works

Every database type has two classes:

  • A Config class that defines connection settings and init behavior.

  • A DB class that manages lifecycle: create_db(), stop_db(), delete_db(), restart_db().

It supports:

  • Init scripts: SQL, JS, SH, depending on the DB engine.

  • Volume persistence: so you can reuse data across runs.

  • Environment injection: useful for script templating.

  • Native drivers: psycopg2, pymongo, pyodbc, mysql-connector.

All you need is Python 3.7+ and a running Docker daemon on the host machine.
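
Seeding and persistence both hang off the Config object. Here's a sketch of what that looks like; note that the init_scripts and volume_path keyword names below are illustrative, so check the project README for the exact arguments:

from docker_db.postgres_db import PostgresConfig, PostgresDB

# NOTE: init_scripts and volume_path are illustrative names;
# see the README for the exact keyword arguments.
config = PostgresConfig(
    user="botuser",
    password="botpass",
    database="wggesucht_db",
    container_name="local-postgres",
    init_scripts=["schema.sql", "seed_listings.sql"],  # run on first boot
    volume_path="./pgdata",  # persist data across runs
)

db = PostgresDB(config)
db.create_db()  # container starts seeded, and data survives restarts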

What You Can Use It For

Here are some use cases I’ve explored or seen:

  • Data science notebooks with SQL backends that boot on demand.

  • CI testing environments that require disposable database containers (see the pytest sketch after this list).

  • Teaching SQL or NoSQL without asking students to install anything.

  • Microservice development with predictable, isolated DB instances.

  • Rapid prototyping for apps that need seeded data on day one.
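
For the CI case, a disposable Postgres fits neatly into a pytest fixture. Here's a sketch using the classes shown earlier (the fixture and test names are my own):

import pytest

from docker_db.postgres_db import PostgresConfig, PostgresDB

@pytest.fixture(scope="session")
def postgres_db():
    # Spin up a throwaway Postgres container for the test session
    config = PostgresConfig(
        user="testuser",
        password="testpass",
        database="test_db",
        container_name="ci-postgres",
    )
    db = PostgresDB(config)
    db.create_db()
    yield db
    # Tear the container down once the session ends
    db.delete_db()

def test_connection(postgres_db):
    cursor = postgres_db.connection.cursor()
    cursor.execute("SELECT 1;")
    assert cursor.fetchone()[0] == 1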

Philosophy

py-dockerdb is intentionally minimal. It’s not trying to replace docker-compose for full-stack orchestration. It doesn’t scaffold services or guess your intentions.

Instead, it focuses on one thing: letting you control local databases, entirely from Python, using real Docker containers.

No DSLs. No hidden automagic. Just code.

Supported Databases

  • PostgreSQL

  • MySQL

  • MongoDB

  • Microsoft SQL Server

And Cassandra is on the roadmap.

Installation

Just install it from PyPI:

pip install py-dockerdb

And you're ready to go.

The Bottom Line

Working with databases locally shouldn’t be an afterthought. It’s one of the most repeated steps in any backend, data, or devops workflow. Yet we still hand-wave it away with vague instructions and flaky setup scripts.

With py-dockerdb, you can keep database setup alongside your logic—reproducible, isolated, and inspectable.

Your teammates (and future self) will thank you.
