How I Built a Role-Based GenAI Chatbot Using FastAPI and LangChain

Ujwal Mahajan
4 min read

A beginner-friendly guide to building a smart chatbot that gives different answers to different departments using RAG, FastAPI, and LangChain.



🚀 Introduction

As a B.Tech CSE student interested in GenAI and real-world backend development, I recently participated in the Codebasics Resume Project Challenge — a program where students are given industry-level problem statements to build portfolio-worthy projects.

I chose the challenge:

“Build a RAG-Based Assistant to Deliver Role-Specific Insights Across Departments in a FinTech Company.”

This inspired me to create a role-based GenAI chatbot using FastAPI, LangChain, and ChromaDB, which answers user queries based on the department or job role they belong to — like Finance, HR, or Tech.

In this blog, I’ll share:

  • What I built

  • The tech stack I used

  • How I implemented role-based access in a GenAI system

  • What I learned (and struggled with)

Let’s dive in 🚀


🎯 About the Codebasics Challenge

The Codebasics Resume Project Challenge is a career-focused initiative where students and professionals build projects based on real industry use cases — with a strong focus on business understanding, clean architecture, and deployment readiness.

The project I picked was from the FinTech domain and required building a chatbot that could:

  • Answer department-specific queries

  • Use GenAI with context retrieval (RAG)

  • Be modular, clean, and scalable

🔗 Learn more about Codebasics Projects


💡 What I Built

The goal was to build a chatbot that works like a FinTech company assistant, answering questions based on the user’s role (e.g., HR gets different answers than Finance).

✅ For example:

  • HR asks: “What’s the leave policy?”

  • Finance asks: “What are Q1 revenue targets?”

👉 Each gets accurate, filtered answers from their own documents.

I used Retrieval-Augmented Generation (RAG) for this — combining vector search with LLMs.


🛠️ Tech Stack

  • 🔹 FastAPI – for building the backend API

  • 🔹 LangChain – to implement the RAG pipeline

  • 🔹 ChromaDB – vector store to retrieve relevant content

  • 🔹 DeepSeek LLM (or OpenAI) – for generating final answers

  • 🔹 Pydantic – for request validation and type checking

  • 🔹 Python – for the overall logic and backend
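
To give a feel for how these pieces fit together, here’s a rough sketch of the FastAPI + Pydantic layer. The route, model names, and the `answer_query` helper are all illustrative placeholders I’m using for this post, not the project’s exact code:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    role: str   # e.g. "HR", "Finance", or "Tech"
    query: str  # the user's question

class ChatResponse(BaseModel):
    answer: str

def answer_query(role: str, query: str) -> str:
    # Placeholder: the real logic calls the RAG pipeline shown later
    return f"[{role}] answer to: {query}"

@app.post("/chat", response_model=ChatResponse)
def chat(request: ChatRequest) -> ChatResponse:
    # Pydantic validates the body; invalid requests get a 422 automatically
    return ChatResponse(answer=answer_query(request.role, request.query))
```

The nice part of this setup is that FastAPI and Pydantic handle validation and docs (via /docs) for free, so the endpoint stays a thin wrapper around the RAG logic.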


🧠 What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that enhances LLMs by grounding their answers in external knowledge.

Here's how it works:

  1. The system accepts a query (like “What’s our leave policy?”)

  2. It searches a vector database for relevant chunks of documents

  3. The top-matching chunks are passed to the LLM

  4. The LLM uses these chunks to generate an accurate, contextual answer

This ensures that answers are reliable and based on internal documents, not just the LLM’s training data.
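
To make those four steps concrete, here’s a minimal, framework-agnostic sketch. The `vector_store` and `llm` parameters are assumptions standing in for a LangChain vector store (like ChromaDB) and a chat model client:

```python
def rag_answer(query: str, vector_store, llm, k: int = 4) -> str:
    # Steps 1-2: search the vector database for the top-k relevant chunks
    docs = vector_store.similarity_search(query, k=k)

    # Step 3: pack the matching chunks into the prompt as grounding context
    context = "\n\n".join(doc.page_content for doc in docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

    # Step 4: the LLM generates an answer grounded in the retrieved chunks
    return llm.invoke(prompt).content
```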


🔐 How I Implemented Role-Based Access

Here’s how I implemented role filtering in the RAG system:

  1. Loaded company documents into the system with role metadata:

    • HR documents → role: "HR"

    • Finance docs → role: "Finance", etc.

  2. Split them into clean chunks (using a document splitter)

  3. Converted the chunks into embeddings (via a HuggingFace model)

  4. Stored the embeddings in ChromaDB, along with their role tag

  5. On each user query, the backend receives the user’s role and their question

  6. It uses a metadata-aware retriever to fetch only the chunks tagged with that role

  7. Passes those to an LLM to generate the final response

✅ Result: Different roles get filtered, context-aware answers.
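
For reference, here’s a condensed sketch of what steps 1–6 could look like in code. The package names, file paths, and embedding model are assumptions (LangChain’s imports vary by version), so treat this as an outline rather than the project’s exact implementation:

```python
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma

# Step 1: load documents and tag each one with its department as metadata
raw_docs = [
    Document(page_content=open("docs/hr_policies.txt").read(),
             metadata={"role": "HR"}),
    Document(page_content=open("docs/finance_report.txt").read(),
             metadata={"role": "Finance"}),
]

# Step 2: split into chunks; each chunk keeps its parent's metadata
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(raw_docs)

# Steps 3-4: embed the chunks and store them (with their role tag) in ChromaDB
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
vector_store = Chroma.from_documents(chunks, embeddings)

# Steps 5-6: at query time, a metadata-aware retriever fetches only the
# chunks whose role tag matches the requesting user's role
def get_retriever(role: str):
    return vector_store.as_retriever(
        search_kwargs={"k": 4, "filter": {"role": role}}
    )

hr_docs = get_retriever("HR").invoke("What's the leave policy?")
```

With this in place, the FastAPI endpoint just plugs the retrieved chunks into the LLM prompt, as in the RAG sketch above.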


🏗️ Architecture Diagram

Here’s the flow of the entire system:

User (query + role) → FastAPI endpoint → role-filtered retrieval from ChromaDB → relevant chunks → LLM → final answer


⚠️ Challenges I Faced

  • 🔴 Understanding metadata filtering with LangChain

  • 🔴 Structuring document chunks properly for better embedding

  • 🔴 Connecting FastAPI routes cleanly with the RAG logic

  • 🔴 Managing response formats and debugging token issues

But every issue helped me learn something new — especially around building production-level LLM workflows.


📚 What I Learned

  • ✅ How to use LangChain + ChromaDB with role-based filters

  • ✅ How to build APIs using FastAPI for AI-based applications

  • ✅ How RAG works in real-world document scenarios

  • ✅ How to think like a backend engineer while building GenAI tools


🔮 What’s Next?

Here’s what I’m currently exploring to take this project further:

  • 🧠 Connecting this chatbot to a Chrome Extension

  • 🔁 Learning LangGraph to build agent workflows

  • 📦 Learning Docker to package and deploy the app

  • 🌐 Hosting the project on a public website as a live demo



🙏 Final Thoughts

This project gave me confidence to build and deploy real-world GenAI tools.

If you're learning Python, FastAPI, or working on GenAI, I highly recommend taking part in challenges like these — they push you to go beyond the basics.

Thanks for reading! If you have questions or feedback, I’d love to connect.

