Project Structure

Yash Maini

A well-structured project makes it easier to scale, debug, and maintain code. Below is the directory structure of Predict-Pipe:

Predict_Pipe/
├── .github/workflows/.gitkeep
├── config/
│   └── config.yaml
├── params.yaml
├── main.py
├── Dockerfile
├── setup.py
├── research/
│   └── trials.ipynb
├── templates/
│   └── index.html
└── src/
    └── Predict_Pipe/
        ├── __init__.py
        ├── components/
        │   └── __init__.py
        ├── utils/
        │   ├── __init__.py
        │   └── common.py
        ├── logging/
        │   └── __init__.py
        ├── config/
        │   ├── __init__.py
        │   └── configuration.py
        ├── pipeline/
        │   └── __init__.py
        ├── entity/
        │   ├── __init__.py
        │   └── config_entity.py
        └── constants/
            └── __init__.py

NOTE: This directory structure is generated by the template script given below. Additional files are added manually to the relevant directories as the project grows.

Key Files & Directories

  • .github/workflows/: Stores CI/CD workflows.

  • config/: Stores configuration files like config.yaml.

  • params.yaml: Defines hyperparameters and other settings.

  • main.py: The entry point of the application.

  • Dockerfile: Defines the containerization process.

  • setup.py: Script for packaging and installing the project.

  • research/: Stores Jupyter notebooks for experimentation.

  • templates/: Contains HTML templates for web-based interactions.

  • src/Predict_Pipe/: The core package, containing:

    • components/: Modules for data ingestion, validation, transformation, training, and evaluation.

    • utils/: Utility functions like saving/loading models.

    • logging/: Custom logging setup.

    • config/: Configuration handling.

    • pipeline/: Orchestrates the ML pipeline.

    • entity/: Stores entity definitions, such as the configuration entities in config_entity.py.

    • constants/: Defines project-wide constants.
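
To make the split between entity/, config/, and the YAML files more concrete, here is a minimal sketch of how they might fit together. It is an illustration under assumptions, not the project's actual code: the names DataIngestionConfig and ConfigurationManager and the keys root_dir and source_url are hypothetical, and a plain dictionary stands in for the parsed contents of config/config.yaml.

```python
from dataclasses import dataclass
from pathlib import Path


# Hypothetical entity: the kind of class that could live in entity/config_entity.py.
@dataclass(frozen=True)
class DataIngestionConfig:
    root_dir: Path
    source_url: str


# Hypothetical manager: the kind of class that could live in config/configuration.py.
class ConfigurationManager:
    def __init__(self, config: dict):
        # In a real project this dictionary would come from parsing config/config.yaml.
        self.config = config

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        # Convert the raw section into a typed, immutable entity.
        section = self.config["data_ingestion"]
        return DataIngestionConfig(
            root_dir=Path(section["root_dir"]),
            source_url=section["source_url"],
        )


# Stand-in for the parsed YAML; the keys and values are made up for illustration.
example_config = {
    "data_ingestion": {
        "root_dir": "artifacts/data_ingestion",
        "source_url": "https://example.com/data.zip",
    }
}

manager = ConfigurationManager(example_config)
ingestion_config = manager.get_data_ingestion_config()
print(ingestion_config.root_dir)
```

Keeping the entity frozen means a pipeline component receives a read-only view of its settings, so configuration cannot be changed by accident halfway through a run.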


Project Template Script

To automate the creation of this project structure, we use the following Python script:

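import logging
import os
from pathlib import Path

# Log each action with a timestamp.
logging.basicConfig(level=logging.INFO, format='[%(asctime)s]: %(message)s')

project_name = 'Predict_Pipe'

# Every file the template should contain, relative to the project root.
list_of_files = [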
    ".github/workflows/.gitkeep",
    f"src/{project_name}/__init__.py",
    f"src/{project_name}/components/__init__.py",
    f"src/{project_name}/utils/__init__.py",
    f"src/{project_name}/utils/common.py",
    f"src/{project_name}/logging/__init__.py",
    f"src/{project_name}/config/__init__.py",
    f"src/{project_name}/config/configuration.py",
    f"src/{project_name}/pipeline/__init__.py",
    f"src/{project_name}/entity/__init__.py",
    f"src/{project_name}/entity/config_entity.py",
    f"src/{project_name}/constants/__init__.py",
    "config/config.yaml",
    "params.yaml",
    "main.py",
    "Dockerfile",
    "setup.py",
    "research/trials.ipynb",
    "templates/index.html",
]

for filepath in list_of_files:
    filepath = Path(filepath)
    filedir, filename = os.path.split(filepath)

    # Create the parent directory first; top-level files have none.
    if filedir != "":
        os.makedirs(filedir, exist_ok=True)
        logging.info(f"Creating directory: {filedir} for the file: {filename}")

    # Create the file only if it is missing or empty, so existing content is never overwritten.
    if (not os.path.exists(filepath)) or (os.path.getsize(filepath) == 0):
        with open(filepath, "w"):
            pass
        logging.info(f"Creating empty file: {filepath}")
    else:
        logging.info(f"{filename} already exists")

This script creates every file and directory in the list that is missing and leaves existing, non-empty files untouched. It is therefore safe to re-run, and it keeps the project structure clean and reproducible.
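
As a variation, the same idea can be wrapped in a function that uses only pathlib and takes the target directory as an argument, which makes it easy to try out in a temporary folder. This is a sketch rather than the script used in the project: the function name create_structure and its parameters are assumptions introduced here for illustration.

```python
import logging
import tempfile
from pathlib import Path
from typing import Iterable, List

logging.basicConfig(level=logging.INFO, format="[%(asctime)s]: %(message)s")


def create_structure(base_dir: Path, files: Iterable[str]) -> List[Path]:
    """Create each file (and its parent directories) under base_dir.

    Returns the paths that were newly created; existing files are left untouched.
    """
    created = []
    for relative in files:
        target = base_dir / relative
        # Make sure the parent directory exists before creating the file.
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.exists():
            logging.info("%s already exists", target)
            continue
        target.touch()
        created.append(target)
        logging.info("Created empty file: %s", target)
    return created


# Try it in a throwaway directory so nothing in a real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    sample = ["src/Predict_Pipe/__init__.py", "config/config.yaml", "main.py"]
    first_run = create_structure(Path(tmp), sample)
    second_run = create_structure(Path(tmp), sample)
    print(len(first_run), len(second_run))
```

The second call creates nothing because every file already exists. Unlike the script above, this variant skips existing empty files instead of rewriting them; the resulting files on disk are the same either way.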


Next, we will break down the modular pipeline components in detail.
