🚧 How I Built My First Self-Coding AI Agent Using Ollama

"Can an AI help me write production-ready code inside my own repo, using my structure?"
That one question led me down a rabbit hole, and out came a self-coding AI agent that understands my codebase and builds features like a lightning-fast junior developer.
In this post, I'll walk you through how I built it using Ollama, LangChain, and a few Python tricks. If you're tired of copy-pasting AI output that doesn't fit your project, this one's for you.
🔍 The Motivation
As a backend developer, I work with FastAPI, SQLAlchemy, and tools like Telnyx. Most of my time goes into:
- Repeating boilerplate code (CRUD, routes, schemas)
- Ensuring consistent project structure
- Debugging AI output that doesn't match my codebase
I wanted an AI that could:
- Understand my repo
- Follow my patterns
- Write new modules automatically
And I didn't want to hit OpenAI's APIs for every little thing. So I turned to Ollama, a lightweight local LLM runner.
🧰 Tools I Used
- 🧠 Ollama: for running LLMs locally (LLaMA3, CodeLLaMA, Phi-3, etc.)
- 🐍 Python: to build the agent logic
- 🔗 LangChain: to chain tools, context, and models
- 🗂️ My own codebase: a working FastAPI app with a clear modular structure
🧩 Architecture Overview
```
+----------------------------+
|      User Prompt (API)     |
+-------------+--------------+
              |
              v
+----------------------------+
|  Ollama Agent (LangChain)  |
|   - Prompt Template        |
|   - Code Context Loader    |
|   - File Writer Tool       |
+-------------+--------------+
              |
              v
+----------------------------+
|  Local Repo Manipulation   |
|  (Reads/writes code files) |
+----------------------------+
```
My Ollama agent is powered by a LangChain agent that:
- Reads relevant files from the repo
- Passes the code context to the model
- Uses a prompt template tailored to my stack
- Writes new files directly into the repo
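To make that concrete, here's a minimal sketch of the wiring, not my exact code. The import paths assume a recent LangChain release with the community Ollama wrapper installed; `CRUD_PROMPT` is the template shown later in this post, and `load_code_context`, `extract_files`, and `write_generated_files` are helpers sketched in the following sections.
```python
# Minimal agent wiring (a sketch). Assumes `langchain` and
# `langchain-community` are installed and Ollama is running locally.
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate

llm = Ollama(model="codellama:13b", temperature=0)

# CRUD_PROMPT is the Jinja-style template shown later in this post.
prompt = PromptTemplate.from_template(CRUD_PROMPT, template_format="jinja2")
chain = prompt | llm  # LCEL: render the prompt, then call the model

reply = chain.invoke({
    "entity_name": "user",
    "code_context": load_code_context("."),  # helper sketched below
})
write_generated_files(extract_files(reply))  # helpers sketched below
```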
🧠 Making the AI Understand My Code
I created a context loader that:
- Walks through the project directory
- Picks important files (models/*.py, schemas/*.py, etc.)
- Feeds their content as part of the prompt
To avoid token limits, I use:
- Chunking (LangChain TextSplitter)
- File filtering
- Optionally: embedding and retrieval (RAG)
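A condensed version of that loader might look like this. The directory names and the character budget are illustrative, not my exact values.
```python
# Sketch of a context loader: walk the repo, collect relevant .py files,
# and chunk them so the prompt stays within the model's context window.
from pathlib import Path
from langchain.text_splitter import RecursiveCharacterTextSplitter

INCLUDE_DIRS = ("models", "schemas", "routes")  # the "important" folders

def load_code_context(repo_root: str, max_chars: int = 8000) -> str:
    splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=0)
    pieces = []
    for directory in INCLUDE_DIRS:
        for path in sorted(Path(repo_root, directory).glob("*.py")):
            chunks = splitter.split_text(path.read_text())
            if chunks:  # keep only the first chunk of each file
                pieces.append(f"# --- {path.as_posix()} ---\n{chunks[0]}")
    return "\n\n".join(pieces)[:max_chars]  # hard cap as a last resort
```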
🧾 The Prompt That Made It Work
```
You are a coding assistant for a FastAPI backend project.
Use the following structure to generate a new CRUD module for: {{ entity_name }}

Expected Files:
- models/{{ entity_name }}.py
- schemas/{{ entity_name }}.py
- routes/{{ entity_name }}.py
- tests/test_{{ entity_name }}.py

Follow the existing style:
{{ code_context }}

Respond with valid Python code blocks only.
```
With this, the model didn't hallucinate file names or invent new folder structures. It stuck to my existing format.
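One detail the prompt glosses over: the reply comes back as one big string, so the agent has to split it back into files. A simple approach, assuming you also ask the model to start each block with a `# file: <path>` comment (that header convention is my addition here, not part of the prompt above), is a pair of regexes:
```python
# Pull fenced Python blocks out of the model's reply and map each one
# to its target path via a "# file: <path>" comment on the first lines.
import re

BLOCK_RE = re.compile(r"`{3}(?:python)?\n(.*?)`{3}", re.DOTALL)
FILE_RE = re.compile(r"^#\s*file:\s*(\S+)", re.MULTILINE)

def extract_files(reply: str) -> dict[str, str]:
    """Return {target_path: code} for every fenced block in the reply."""
    files = {}
    for block in BLOCK_RE.findall(reply):
        match = FILE_RE.search(block)
        if match:
            files[match.group(1)] = block.strip() + "\n"
    return files
```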
🧪 A Real Example
When I sent a POST request like this:
```json
{
  "entity_name": "user"
}
```
My agent generated:
- ✅ models/user.py with a SQLAlchemy model
- ✅ schemas/user.py with Pydantic classes
- ✅ routes/user.py with FastAPI endpoints
- ✅ tests/test_user.py with test cases
And it perfectly matched the coding style I use across the project. No more awkward copy-paste fixes.
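For completeness, the endpoint behind that request can be a few lines of FastAPI. The route path and request model here are assumptions, not my exact code:
```python
# A plausible shape for the endpoint that accepts the request above.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    entity_name: str

@app.post("/generate")
def generate(req: GenerateRequest) -> dict:
    reply = chain.invoke({
        "entity_name": req.entity_name,
        "code_context": load_code_context("."),
    })
    # write_generated_files is sketched in the next section
    written = write_generated_files(extract_files(reply))
    return {"files": written}
```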
⚙️ Internals: Tools & Execution
Here's what powers the whole setup:
- LangChain AgentExecutor with:
  - A code context retriever
  - A prompt template tool
  - A file writer tool
- Ollama LLMs:
  - I experimented with codellama:13b, llama3, and phi3
  - CodeLLaMA worked best for structured code tasks
- Filesystem Tool:
  - Saves each code block as a new Python file
  - Adds it into the correct path in the repo
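The file writer is the piece worth being paranoid about, since the model chooses the paths. A minimal version that keeps all writes inside the repo might look like this (a simplified sketch, not the actual tool):
```python
# Write each generated code block to its path, refusing anything that
# would escape the repo root (the model controls the path strings).
from pathlib import Path

def write_generated_files(files: dict[str, str], repo_root: str = ".") -> list[str]:
    root = Path(repo_root).resolve()
    written = []
    for rel_path, code in files.items():
        target = (root / rel_path).resolve()
        if root != target and root not in target.parents:
            raise ValueError(f"refusing to write outside the repo: {rel_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(code)
        written.append(str(target.relative_to(root)))
    return written
```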
💡 What I Learned
- Local LLMs like CodeLLaMA can seriously reduce your cloud dependency
- Adding your own tools (like file writers) gives you control and flexibility
- Combining LLMs with context-aware design = production-grade AI automation
- For larger projects, RAG with FAISS or Chroma will help a lot (see the sketch below)
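On that last point, here's roughly what retrieval looks like: embed each file once, then pull only the top matches into the prompt per request. The embedding model name and `k` are assumptions, and this requires `chromadb` installed:
```python
# Sketch of RAG over the codebase with Chroma and local Ollama embeddings.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

def build_retriever(code_files: dict[str, str]):
    store = Chroma.from_texts(
        texts=list(code_files.values()),
        metadatas=[{"path": p} for p in code_files],
        embedding=OllamaEmbeddings(model="nomic-embed-text"),
    )
    return store.as_retriever(search_kwargs={"k": 4})

# Usage: feed only the retrieved files into {{ code_context }}.
# docs = build_retriever(files).invoke("user CRUD module")
```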
🚀 What's Next?
Here's what I'm planning to build on top of this:
- ✅ Git auto-commit for all agent-generated files
- ✅ Code test runner after generation
- ✅ YAML config to guide agent behavior (naming, folders)
- ✅ A web UI with file preview + approval workflow
- 🧪 Prompt optimization for different types of modules (jobs, services, handlers)
🎯 Final Thoughts
This project made me realize that LLMs aren't just code helpers; they can be active contributors to your development process.
If you already have a working codebase and want to save hours of repetitive dev work, give Ollama + LangChain a try. Build your own repo-native AI engineer.
It feels like magic, only faster.
✍️ Written by Sagar Arora GPT
Follow me for more experiments on AI tooling, dev automation, and self-coding agents.