AI Engineering: Fundamental Code Practices to Build Production-Ready AI Applications

Samiksha Kolhe

Hello Techies👋! I'm Samiksha; hope you're all doing amazing things. Welcome to another blog about building package code that can easily be shipped from POC to production. This post is a practical extension of the project "AI Consultant Hybrid-RAG Package" and discusses the best approaches I used to build it.

Please check out this blog for a deep dive into RAG and how to scale it from POC to production: https://teckbakers.hashnode.dev/ai-consultant-hybrid-rag-chatbot

What Does a Production-Scale AI Application Mean?

A production-scale AI application is one that keeps delivering reliably once real users and real traffic hit it, not just in a controlled demo.

  • A production-ready AI application is a stable, reliable, and secure system designed to operate effectively and meet real-world user demands, rather than just a prototype. It must be scalable to handle high loads, cost-effective at scale, and incorporate essential features like robust monitoring, comprehensive security, privacy controls, and efficient operational processes for deployment, updates, and ongoing management. This includes addressing challenges such as hallucinations, data governance, and user experience to provide value continuously.

    Key Characteristics of a Production-Ready AI Application

    • Scalability:

      The system can handle a large and growing number of users and requests without performance degradation.

    • Reliability & Stability:

      It is robust and can consistently deliver accurate results, remaining available and functioning correctly even under heavy load or unexpected conditions.

    • Security & Privacy:

      It protects sensitive user data, prevents unauthorized access, and mitigates emerging security threats, including those specific to AI models.

    • Cost-Effectiveness:

      The infrastructure and tools used are economically viable for production-level usage.

    • Monitoring & Observability:

      It includes mechanisms to continuously track performance, identify issues, and understand system behavior in real time.

    • Operational Efficiency:

      Features automated deployments, continuous delivery, and efficient processes for updating models and managing the application in the long term.

    • Data Governance:

      Data used for training and operation complies with relevant data governance policies and standards, ensuring quality, privacy, and security.

    • Quality & User Experience:

      It minimizes AI-specific quality issues like hallucinations and incomplete answers, providing a seamless and valuable experience for end-users.

    • Error Handling & Resilience:

The application has comprehensive error handling to gracefully manage failures and stay available even when parts of the system encounter problems.

Without further ado, let's discuss the code practices that make AI systems fundamentally robust.

Below, I discuss the approaches I used while building the advanced RAG system package.

Practical Code Practices to Build Robust AI Systems:

Check out this repository: https://github.com/kolhesamiksha/Hybrid-Search-RAG, where I followed a modular approach so that every component is customizable and isolated, and can be combined in whatever way fits your own hybrid-RAG use case.

Let's discuss the components one by one…

Following this code structure is practical for building production-ready AI solutions: it encourages best practices for easy code development, faster deployment, and fewer bottlenecks, and it pairs naturally with CI/CD pipelines that manage the continuous development environment (a sketch of such a pipeline follows below).
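The actual pipeline definitions live in the repository's workflows/ directory; purely as a hedged sketch, a minimal GitHub Actions job that reuses the Makefile targets covered later in this post could look like this (the file name, Python version, and target names are assumptions, not the project's exact configuration):

    # .github/workflows/ci.yml -- illustrative sketch, not the repository's actual pipeline
    name: CI
    on: [push, pull_request]

    jobs:
      checks:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-python@v5
            with:
              python-version: "3.11"
          - name: Install Poetry
            run: pip install poetry
          - name: Install dependencies
            run: make install        # assumed Makefile target (see the Makefile section below)
          - name: Lint, type-check and test
            run: |
              make lint
              make type-check
              make test
          - name: Build wheel
            run: make build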

When building robust LLM applications, a well-organized project structure is crucial for maintainability, scalability, and collaboration. Here's a breakdown of the key directories and their purposes:

  1. Package Building:

    Why This Layout?

    1. src/ layout → Best practice for Python packaging. It prevents import errors when testing and ensures you always install the package before running it.

    2. Modularization → Each major feature (RAG, prompts, summarization, vector DB, moderation, guardrails, evaluation, etc.) gets its own subpackage.

    3. Packaging Ready → When you build with Poetry (make build → creates a .whl), only the hybrid_rag/src/... modules are included.


🔹 Package Structure Explained

    hybrid_rag/
    │
    ├── src/
    │   ├── advance_rag/        # Advanced retrieval-augmented generation (extended logic)
    │   ├── custom_asr_mlflow/  # Custom ASR (speech-to-text) integrated with MLflow
    │   ├── custom_mlflow/      # Custom MLflow utilities for model logging/tracking
    │   ├── evaluate/           # Evaluation metrics (e.g., ragas, ranking, scoring)
    │   ├── guardrails/         # Guardrails for safe AI responses (filters, policies)
    │   ├── models/             # Model wrappers/adapters (LLMs, embeddings, ASR, etc.)
    │   ├── moderation/         # Content moderation (OpenAI filters, safety layers)
    │   ├── prompts/            # Prompt templates & chains for LangChain workflows
    │   ├── summarization/      # Summarization logic for long contexts
    │   ├── utils/              # Shared utility functions
    │   ├── vectordb/           # Vector DB integrations (FAISS, Milvus, etc.)
    │   │
    │   ├── __init__.py         # Makes src/ a Python package
    │   ├── config.py           # Central config (env vars, paths, constants)
    │   ├── log_model.py        # Logging models with MLflow
    │   ├── rag.py              # Core RAG pipeline orchestrator
    │   ├── summarizer.py       # Summarization entry point
    │
    ├── README.md               # Documentation
    ├── api.py                  # FastAPI entrypoint
    ├── porter.py               # Probably helpers for loading/deploying pipelines
    ├── test.py                 # Quick test runner
    └── __init__.py             # Package root

🔹 How This Supports Package Building

  1. All modules live inside src/ → So when you install with Poetry or pip install ., Python sees:

     import hybrid_rag
     from hybrid_rag.prompts import ...
     from hybrid_rag.vectordb import ...
    

    instead of polluting the root directory.

  2. Each feature = isolated folder
    → Encourages modular development. For example:

    • Someone working only on summarization doesn’t touch vectordb.

    • Easier testing, easier maintenance.

  3. Wheel/distribution ready

    • The pyproject.toml → [tool.poetry] packages = [{ include = "hybrid_rag" }] ensures only hybrid_rag is bundled when building a wheel.

    • Running:

        poetry build
      

      creates a .whl that includes src/ contents.


✅ In short: this package design follows the src layout with modular components, the best practice for building Python packages that are clean, testable, and easily distributable.

  2. Common folders and their purposes:

    tests/:

    • Purpose: Contains all test files mirroring the source structure

    • Best Practice: Follow the same directory structure as src/ for easy test discovery

    • Example: tests/unit/, tests/integration/

config/:

  • Purpose: Stores all configuration files.

  • Contents:

    • Environment variables

    • Model parameters

    • API credentials (securely managed)

    • Feature flags

docs/:

  • Purpose: Comprehensive project documentation

  • Should Include:

    • API documentation

    • Setup guides

    • Architecture diagrams

    • Contribution guidelines

notebooks/:

  • Purpose: For exploratory data analysis, prototyping, and demos

  • Best Practice: Keep notebooks focused and well-documented

examples/:

  • Purpose: One-off scripts for data processing, setup, or maintenance

  • Example: Data preprocessing, model training, or deployment scripts

workflows/:

  • Purpose: Contains CI/CD pipeline definitions

  • Example: GitHub Actions, GitLab CI, or other workflow files.

design/:

  • Purpose: Contains all architecture diagrams/ppt of the project/product.

.env.example:

  • Purpose: Contains env variables, configurations, constants etc.

  • Example: In this project, I use it as the configuration file that exposes all the package's plugins/settings, so users can configure the package for their own needs (an illustrative sketch follows below).
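  The real keys are defined in the repository's .env.example; purely as an illustration of the idea, such a file might look like the sketch below (every variable name here is hypothetical, not the actual configuration):

    # .env.example -- illustrative sketch only; the actual keys live in the repository
    LLM_PROVIDER=groq                        # which chat-model backend to use
    LLM_MODEL_NAME=llama-3.1-8b-instant
    EMBEDDING_MODEL_NAME=BAAI/bge-small-en-v1.5
    VECTOR_DB_URI=http://localhost:19530     # e.g. a Milvus endpoint
    VECTOR_DB_COLLECTION=hybrid_rag_docs
    MONGODB_URI=mongodb://localhost:27017    # chat-history store
    ENABLE_GUARDRAILS=true
    ENABLE_RERANKER=true

  A user copies this file to .env and fills in real values; with python-dotenv (already a dependency) the apps can then load them at startup.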

  3. Automation testing (tests/) for the CI pipeline:

    Writing tests in pytest is straightforward. That said, there are a few guidelines you should follow to write basic tests effectively.

    To begin, create Python files whose names are prefixed with test_ to indicate they contain tests. Within each file, give test functions names that clearly describe what you are testing, for example test_addition() or test_file_reading().

    To assert outcomes, use plain assert statements: pytest rewrites them so that a failing condition raises an AssertionError with a detailed message, giving clear feedback on the test outcome. (The assertEqual()/assertTrue() helpers belong to unittest, not pytest.)
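    For instance, a minimal unit test might look like the sketch below; the file path and the split_into_chunks() helper are hypothetical, used only to show the shape of a pytest test.

      # tests/unit/test_chunking.py -- hypothetical example of the basic pytest pattern
      import pytest


      def split_into_chunks(text: str, size: int) -> list[str]:
          """Stand-in for a real function imported from the package."""
          if size <= 0:
              raise ValueError("size must be positive")
          return [text[i:i + size] for i in range(0, len(text), size)]


      def test_split_into_chunks_returns_expected_pieces():
          assert split_into_chunks("abcdef", 2) == ["ab", "cd", "ef"]


      def test_split_into_chunks_rejects_bad_size():
          with pytest.raises(ValueError):
              split_into_chunks("abcdef", 0)

    Running pytest (or make test, described later) discovers and executes these automatically.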

  4. Pre-commit hooks:

    Pre-commit hooks in Python are scripts executed automatically by Git before a commit is finalized. Their primary purpose is to ensure code quality, consistency, and adherence to project standards by running various checks and transformations on the staged files. If any of these checks fail, the commit process is aborted, allowing the developer to address the issues before committing the changes.

    The pre-commit framework is a popular tool for managing and running these hooks.

    Common Python pre-commit hooks:

    • black: An opinionated code formatter that ensures consistent code style.

    • flake8: A tool that checks for PEP 8 compliance and other common Python errors.

    • isort: Sorts and organizes Python imports alphabetically and by type.

    • mypy: A static type checker for Python.

    • trailing-whitespace: Removes trailing whitespace from files.

    • end-of-file-fixer: Ensures files end with a single newline.

    • check-yaml: Checks YAML file syntax.

    • check-added-large-files: Prevents committing excessively large files.

By implementing pre-commit hooks, development teams can enforce coding standards, improve code quality, and reduce the likelihood of introducing bugs early in the development cycle.

Below hooks I implemented in the project:

  • Code Formatting & Style Enforcement – To ensure our Python codebase remains clean, readable, and consistent, we integrate tools like Black, Ruff, Reorder Python Imports, and PyUpgrade. Black auto-formats code into a uniform style, Ruff enforces linting rules and cleans up imports, Reorder Python Imports standardizes how libraries are arranged, while PyUpgrade automatically rewrites outdated syntax into modern Python (≥3.8) best practices. Together, these tools prevent unnecessary formatting debates, eliminate style drift, and ensure developers focus on solving problems rather than nitpicking code formatting.

  • Quality & Safety Checks – The pre-commit-hooks package provides lightweight but essential checks like validating docstring placement, ensuring YAML/TOML files are valid, fixing trailing whitespace, enforcing final newlines, and blocking large or unwanted files from sneaking into commits. These checks act as a guardrail against common human mistakes that can lead to broken builds, corrupted configs, or messy Git histories. By automating these checks, we shift quality control to the commit stage, catching issues early before they reach CI/CD or production environments.

  • Static Analysis & Type Safety – We use Flake8 and Mypy to bring strong static analysis into the workflow. Flake8 extends beyond formatting to catch logical issues, unused variables, or overly complex functions, while Mypy adds a layer of type checking to catch potential runtime errors before execution. Importantly, Mypy is configured to focus on our hybrid_rag package, enforcing stricter type guarantees where correctness matters most. This pairing ensures our code is not just pretty, but also safe, predictable, and easier to maintain at scale.

  • Documentation & Spell Checking – To keep our project professional and readable, we integrate Codespell for catching common spelling errors while excluding non-relevant files like binaries, notebooks, and configs. At the same time, we use Towncrier to automate changelog generation directly from commit messages, ensuring that every feature, bug fix, and improvement is properly documented without relying on manual bookkeeping. This makes releases traceable, transparent, and compliant with good software engineering practices, while also maintaining polished developer-facing and user-facing communication.

  • Automation & Continuous Improvement – The configuration includes fine-tuned exclusions (like tests/, examples/, docs/, and notebooks) to avoid unnecessary noise, while pre-commit.ci automates running hooks, fixing code, and auto-updating hook versions. This creates a living guardrail where developers don't need to remember every rule; violations are caught, fixed, and committed automatically. Over time, this system enforces higher code quality standards, reduces human error, improves productivity, and ensures the project evolves with modern tooling best practices, all while keeping the Git history clean and meaningful.
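The full .pre-commit-config.yaml in the repository pins exact hook revisions and exclusions; as a trimmed-down sketch of how such hooks are wired together (the rev values below are examples, not the pinned ones), it is organized roughly like this:

    # .pre-commit-config.yaml -- trimmed sketch; the real file pins exact revs and exclusions
    repos:
      - repo: https://github.com/psf/black
        rev: 24.3.0                    # example rev
        hooks:
          - id: black
      - repo: https://github.com/astral-sh/ruff-pre-commit
        rev: v0.5.0
        hooks:
          - id: ruff
      - repo: https://github.com/asottile/pyupgrade
        rev: v3.15.0
        hooks:
          - id: pyupgrade
            args: [--py38-plus]
      - repo: https://github.com/pre-commit/pre-commit-hooks
        rev: v4.5.0
        hooks:
          - id: trailing-whitespace
          - id: end-of-file-fixer
          - id: check-yaml
          - id: check-added-large-files
      - repo: https://github.com/pre-commit/mirrors-mypy
        rev: v1.10.0
        hooks:
          - id: mypy
            files: ^hybrid_rag/        # type-check only the package, as described above
      - repo: https://github.com/codespell-project/codespell
        rev: v2.2.6
        hooks:
          - id: codespell
            exclude: (^notebooks/|\.ipynb$)   # illustrative exclusion

Once this file is in place, pre-commit install activates the hooks and pre-commit run --all-files applies them to the whole repository (the Makefile below exposes both as make targets).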

  5. Makefile:

    A Makefile can be used to automate various tasks within a Python project, streamlining development and ensuring consistency. This is achieved by defining targets, dependencies, and commands in a file named Makefile (capital M by convention; make also accepts makefile) at the root of your project.

    Common Automation Tasks in Python with Makefiles:

    • Virtual Environment Management:

      • Creating a virtual environment.

      • Activating the virtual environment.

      • Installing dependencies within the virtual environment.

    • Testing:

      • Running unit tests (e.g., with pytest).

      • Generating code coverage reports.

    • Code Quality & Formatting:

      • Running linters (e.g., flake8, pylint).

      • Running static type checkers (e.g., mypy).

      • Formatting code (e.g., black).

    • Documentation:

      • Building documentation (e.g., with Sphinx).

    • Cleanup:

      • Removing build artifacts, temporary files, or virtual environments.

This Makefile provides a streamlined developer workflow for the hybrid-rag project, making it easier to install dependencies, run tests, lint, format, type-check, and build releases without remembering long commands. It acts as a command-line toolbox where you simply run make <command> to perform common project tasks.

  1. Installation & Setup – To set up your development environment, run make install, which installs the package in editable mode with dev, lint, typing, and codespell dependencies via Poetry. If you’re using pre-commit hooks, run make install-precommit to install and activate them, or make run-precommit to apply them across all files.

  2. Code Quality & Checks – The Makefile provides shortcuts for ensuring code quality. Run make lint to check code style with Ruff and Flake8, make format to auto-format code with Black, make type-check to verify types using Mypy, and make codespell to catch spelling mistakes in code and docs. These commands enforce consistency and correctness across the codebase.

  3. Testing & Validation – To validate functionality, simply run make test, which executes tests using Pytest inside the tests/ directory. This ensures changes don’t break existing features. If you want to combine style and correctness checks, you can run pre-commit hooks directly, or CI will enforce them automatically.

  4. Building & Releasing – To package the project for distribution, use make build, which generates a wheel file via Poetry. You can also run make changelog to automatically generate release notes with Towncrier, keeping track of new features and fixes without manual effort.

  5. Cleanup & Maintenance – The make clean command removes build artifacts, caches, and temporary files (dist, build, .mypy_cache, .ruff_cache, etc.), ensuring you always have a clean workspace. This is especially useful before fresh builds or when troubleshooting dependency issues.
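Putting those commands together, a condensed sketch of such a Makefile looks like the following; the project's real file differs in detail, and recipe lines in a real Makefile must be indented with tabs:

    # Makefile -- condensed sketch of the targets described above (recipes are illustrative)
    .PHONY: install install-precommit run-precommit test lint format type-check codespell build changelog clean

    install:
    	poetry install --with dev,lint,typing,codespell

    install-precommit:
    	poetry run pre-commit install

    run-precommit:
    	poetry run pre-commit run --all-files

    test:
    	poetry run pytest tests/

    lint:
    	poetry run ruff check . && poetry run flake8

    format:
    	poetry run black .

    type-check:
    	poetry run mypy hybrid_rag

    codespell:
    	poetry run codespell

    build:
    	poetry build

    changelog:
    	poetry run towncrier build --yes

    clean:
    	rm -rf dist build .mypy_cache .ruff_cache .pytest_cache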


Quick Reference Commands

  • make install β†’ Install dependencies (dev + lint + typing + codespell)

  • make install-precommit β†’ Install pre-commit hooks

  • make run-precommit β†’ Run pre-commit checks on all files

  • make test β†’ Run all tests with Pytest

  • make lint β†’ Run linting with Ruff & Flake8

  • make format β†’ Auto-format code with Black

  • make type-check β†’ Run static type checks with Mypy

  • make codespell β†’ Catch typos with Codespell

  • make build β†’ Build the package as a wheel

  • make changelog β†’ Generate changelog with Towncrier

  • make clean β†’ Clean up build and cache files


✅ In short: This Makefile makes common project tasks one-line commands instead of long Poetry/Pytest/Pre-commit invocations, ensuring a smooth, consistent developer experience.

  6. Dockerfile:

    This Dockerfile is divided into two stages: a Build stage where dependencies are compiled and the project is packaged into a wheel file, and a Runtime stage where only the minimal runtime environment is kept, making the final image smaller, cleaner, and production-ready.


    🔹 Stage 1: Build Environment

     FROM python:3.11-slim as builder
    
    • Uses the lightweight Python 3.11 slim image as the base.

    • Sets the working directory to /build.

    • Installs essential system dependencies like ffmpeg, make, build-essential, and Poetry for dependency management.

    COPY hybrid_rag hybrid_rag
    COPY tests tests
    COPY .pre-commit-config.yaml .pre-commit-config.yaml
    COPY Makefile Makefile
    COPY poetry.toml poetry.toml
    COPY pyproject.toml pyproject.toml
  • Copies source code (hybrid_rag), tests, and config files into the build container.
    RUN make build
  • Runs make build, which uses Poetry to package the project as a wheel file (.whl).

  • This step ensures your code compiles correctly, and dependencies are locked before moving into production.

✅ Goal: Compile and package the app in a clean, isolated build environment.


🔹 Stage 2: Runtime Environment

    FROM python:3.11-slim
    WORKDIR /Hybrid-Search-RAG
  • Starts from a fresh Python 3.11 slim image (lighter than keeping build tools).

  • Uses /Hybrid-Search-RAG as the working directory.

    RUN apt-get update && apt-get install -y --no-install-recommends supervisor
  • Installs Supervisor, a lightweight process manager that will run FastAPI and Streamlit together.
    COPY --from=builder /build/dist/*.whl .
    RUN pip install *.whl && rm -rf *.whl
  • Copies the wheel file built in Stage 1.

  • Installs it via pip and deletes the wheel afterward to save space.

  • This ensures the runtime image only has the final package, not the entire source + build tools.


🔹 Application Setup

    COPY chat_restapi chat_restapi
    COPY chat_streamlit_app chat_streamlit_app
    COPY .env.example chat_restapi/.env.example
    COPY .env.example chat_streamlit_app/.env.example
  • Copies your FastAPI app (chat_restapi) and Streamlit app (chat_streamlit_app) into the container.

  • Provides sample .env.example files for environment variable configuration.

    RUN mkdir -p /etc/supervisor/conf.d
    COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
  • Prepares Supervisor configuration to manage both services.
    EXPOSE 8000 8501
    CMD ["supervisord", "-n"]
  • Exposes port 8000 for FastAPI and port 8501 for Streamlit.

  • Launches Supervisor in foreground mode (-n), which keeps the container running while it manages both processes.

✅ Goal: Provide a minimal, production-ready runtime with FastAPI + Streamlit served in one container.
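The supervisord.conf copied above is not reproduced in this post; a rough sketch of what such a config typically looks like is shown below (the program names, module paths, and launch commands are assumptions, not the repository's exact file):

    ; supervisord.conf -- illustrative sketch; commands and module paths are assumptions
    [supervisord]
    nodaemon=true

    [program:fastapi]
    command=uvicorn chat_restapi.app:app --host 0.0.0.0 --port 8000
    directory=/Hybrid-Search-RAG
    autorestart=true

    [program:streamlit]
    command=streamlit run chat_streamlit_app/app.py --server.port 8501 --server.address 0.0.0.0
    directory=/Hybrid-Search-RAG
    autorestart=true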


🚀 Why This Design?

  1. Multi-stage build → Keeps final image small by separating build tools from runtime.

  2. Wheel packaging → Ensures reproducible builds and dependency locking.

  3. Supervisor → Lets you run multiple apps (FastAPI + Streamlit) in one container.

  4. Lightweight runtime → Installs only what's needed for execution, saving space and improving security.


👉 In short:
This Dockerfile builds your project into a wheel in Stage 1, then creates a slim runtime image in Stage 2 with just Python, Supervisor, and your packaged app. It runs both FastAPI (backend) and Streamlit (frontend) together under Supervisor, ready for deployment.
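To try it locally, building and running the image takes two commands; the image tag here is arbitrary, and --env-file assumes you have copied .env.example to .env and filled in real values:

    docker build -t hybrid-search-rag .
    docker run --env-file .env -p 8000:8000 -p 8501:8501 hybrid-search-rag

FastAPI is then reachable on http://localhost:8000 and the Streamlit UI on http://localhost:8501.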

  7. pyproject.toml:

    This pyproject.toml configures the build system, dependencies, development tools, and quality checks for the hybrid-rag project. It's designed for a modern Python workflow using Poetry as the dependency manager, with project metadata declared under [tool.poetry].


    🔹 1. Build System

     [build-system]
     requires = ["poetry-core>=1.0.0"]
     build-backend = "poetry.core.masonry.api"
    
    • Uses Poetry Core as the build backend.

    • Ensures reproducible builds into wheels (.whl) or source distributions.


🔹 2. Project Metadata

    [tool.poetry]
    name = "hybrid-rag"
    version = "0.1.0"
    description = "A Hybrid-search RAG repository"
    authors = ["Samiksha Kolhe <kolhesamiksha25@gmail.com>"]
    repository = "https://github.com/kolhesamiksha/Hybrid-Search-RAG"
    keywords = ["streamlit", "langchain", "openai", "rag", "python", "groq", "hybrid search"]
    packages = [{ include = "hybrid_rag" }]
  • Defines project name, version, description, authors, repo link, and keywords (helps with discovery in PyPI).

  • packages ensures only the hybrid_rag module is included in builds.


🔹 3. Core Dependencies

    [tool.poetry.dependencies]
    python = ">=3.11,<3.13"
  • Project supports Python 3.11 and 3.12.

  • Dependencies include:

    • Frontend: streamlit, streamlit-chat, streamlit-elements

    • LangChain + LLMs: langchain, langchain-core, langchain-openai, langchain-groq, fastembed, faiss-cpu, pymilvus

    • Backend: fastapi, uvicorn, python-dotenv

    • AI/ML Stack: torch, transformers, torchaudio, sentencepiece, librosa, soundfile, onnxruntime

    • Utils: boto3, mlflow, pymongo, psutil, slowapi, memory-profiler, pycryptodome

    • Eval & Ranking: ragas, flashrank, datasets

✅ This makes it a hybrid RAG framework that supports text embeddings, vector DBs, LLMs, audio processing, observability, and model evaluation.


🔹 4. Development & Tooling Dependencies

Grouped under Poetry dependency groups for modular installs:

  • Linting & Formatting

      [tool.poetry.group.lint.dependencies]
      ruff = "^0.5.0"
      flake8 = "^6.0.0"
    

    → Enforces coding standards.

  • Typing

      [tool.poetry.group.typing.dependencies]
      mypy, types-requests, types-pyyaml, etc.
    

    → Strong static typing for safety.

  • Dev & Testing

      [tool.poetry.group.dev.dependencies]
      pytest, pytest-asyncio, black, coverage
    

    → Supports unit testing, async tests, formatting, and coverage reports.

  • Codespell

      [tool.poetry.group.codespell.dependencies]
      codespell
    

    → Catches typos in code & docs.
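Assuming these groups are declared as optional (a common setup, consistent with the make install behaviour described earlier), the tooling can be pulled in on demand instead of being installed with every consumer of the package:

    # install the package plus the optional tool groups declared in pyproject.toml
    poetry install --with dev,lint,typing,codespell

    # or install only what a specific task needs, e.g. just the linters
    poetry install --with lint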


🔹 5. Tool Configurations

  • Towncrier

      [tool.towncrier]
      package = "hybrid-rag"
      filename = "CHANGELOG.md"
    

    → Auto-generates changelogs from commit fragments.

  • Coverage

      [tool.coverage.run]
      omit = ["tests/*", "examples/*", "docs/*"]
    

    → Excludes non-core code from coverage reports.

  • Mypy

      [tool.mypy]
      ignore_missing_imports = true
      disallow_untyped_defs = true
      exclude = ".*(tests|examples|docs).*"
    

    → Strict type checking, skipping test/example/docs directories.

  • Ruff

      [tool.ruff]
      extend-ignore = ["*.ipynb"]
      [tool.ruff.lint]
      select = ["I", "T201"]
    

    → Enforces import sorting (I) and bans print statements (T201).

  • Pytest

      [tool.pytest.ini_options]
      addopts = "--strict-markers --strict-config --durations=5 -vv"
      markers = ["requires", "scheduled", "compile"]
    

    → Enforces strict configs, detailed output, and custom test markers.

  • Codespell

      [tool.codespell]
      skip = '.git,*.pdf,...'
      ignore-regex = '.*(Stati Uniti|Tense=Pres).*'
      ignore-words-list = 'momento,collison,...'
    

    → Skips irrelevant files, ignores false positives, and customizes typo handling.
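Returning to the pytest options above: because --strict-markers is set, only the declared markers (requires, scheduled, compile) may be used, and a declared marker is applied like this (the test itself is hypothetical):

    # hypothetical test demonstrating one of the registered custom markers
    import pytest


    @pytest.mark.scheduled          # declared under [tool.pytest.ini_options] markers
    def test_nightly_vector_index_refresh():
        assert True


    # run only the scheduled subset with:  pytest -m scheduled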


✨ In Summary

  • pyproject.toml = Project Blueprint.

  • Defines metadata, dependencies, and tooling configs in one place.

  • Supports RAG, LLMs, embeddings, and audio processing.

  • Enforces linting, typing, testing, and spell checking via Poetry groups.

  • Provides consistent, reproducible builds with Poetry + wheel packaging.

You are now all set! If you liked this article and want more like it, follow the Teckbakers Generative AI series. I'll soon be starting articles on inference optimization for model deployment, so stay tuned for updates.

Please feel free to contribute to this article in the comments: share your insights and experience in optimizing products, or the code practices that improved your own large-scale production applications. This will help everyone learn from each other's experience!

Till then, stay tuned and follow our newsletter to get daily updates and build projects end to end! Connect with me on LinkedIn, GitHub, and Kaggle.

Let's learn and grow together :) Stay healthy, stay happy ✨. Happy learning!!
