🧩 Building Multi-Agent LLM Systems for Document Processing in the Enterprise

Table of contents
- ⚡ Introduction
- Use Case Overview: Legal Contract Review
- 🧠 Step 1: Setting Up Agents
- Step 2: Creating the Agent Controller
- Step 3: FastAPI Interface
- Step 4: Dockerfile for Deployment
- 🌐 Scaling: Adding Compliance & Summary Agents
- 🧩 Persistence and State Management
- 🧪 Testing Strategy
- 🧰 CI/CD Automation with GitHub Actions
- 📈 Observability and Monitoring
- 🤝 Collaboration and Knowledge Sharing
- 💬 Connect With Me
⚡ Introduction
Enterprises are overwhelmed with document-heavy workflows: invoices, contracts, reports, RFPs. These documents are often unstructured or semi-structured, requiring intelligent parsing, classification, and decision-making.
With the advent of Large Language Models (LLMs), we now have the opportunity to build smart, automated workflows that go beyond basic OCR and NLP pipelines. But a single LLM prompt isn't enough for enterprise-grade reliability.
That's where multi-agent systems shine — they allow us to decompose complex tasks into specialized roles, enabling modularity, resilience, and scalability.
In this article, we'll walk through how to design and implement a production-grade multi-agent LLM system for legal contract review, using LangChain, FastAPI, and modern DevOps practices.
Use Case Overview: Legal Contract Review
We'll simulate a contract ingestion pipeline that:
- Classifies the type of contract (NDA, Employment, Vendor, etc.)
- Extracts key clauses (e.g., termination, payment terms)
- Checks compliance (e.g., presence of required clauses)
- Generates a summary for the legal team
Each step is handled by a dedicated agent, communicating through a controller — forming a modular, testable, and scalable architecture.
🧠 Step 1: Setting Up Agents
ClassifierAgent
from langchain_openai import ChatOpenAI

class ClassifierAgent:
    def __init__(self, llm: ChatOpenAI):
        self.llm = llm

    def classify(self, doc_text: str) -> str:
        # Truncate the document to stay within the model's context window
        prompt = f"""
        What type of legal document is this?
        Options: NDA, Employment Contract, Vendor Agreement, Other
        Document:
        {doc_text[:1000]}
        """
        response = self.llm.invoke(prompt).content.strip()
        return response
ClauseExtractorAgent
import json

class ClauseExtractorAgent:
    def __init__(self, llm: ChatOpenAI):
        self.llm = llm

    def extract(self, doc_text: str) -> dict:
        prompt = f"""
        Extract the following clauses: Termination, Payment Terms, Jurisdiction.
        Return as JSON.
        Document:
        {doc_text[:1500]}
        """
        response = self.llm.invoke(prompt).content.strip()
        return self._parse_json(response)

    def _parse_json(self, raw: str) -> dict:
        # Add robust error handling here
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            raise ValueError("Failed to parse clause extraction output.")
Step 2: Creating the Agent Controller
import logging

logger = logging.getLogger(__name__)

class DocumentProcessingController:
    def __init__(self, agents: dict):
        self.classifier = agents['classifier']
        self.extractor = agents['extractor']

    def run(self, doc_text: str) -> dict:
        try:
            doc_type = self.classifier.classify(doc_text)
            clauses = self.extractor.extract(doc_text)
            return {
                "doc_type": doc_type,
                "clauses": clauses
            }
        except Exception as e:
            # Log and rethrow for centralized error handling
            logger.error(f"Pipeline failed: {str(e)}")
            raise
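With both agents and the controller defined, wiring them together is a few lines. A minimal sketch (the model name here is an assumption; use whichever chat model your deployment supports):

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # assumed model name
controller = DocumentProcessingController({
    "classifier": ClassifierAgent(llm),
    "extractor": ClauseExtractorAgent(llm),
})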
Step 3: FastAPI Interface
from fastapi import FastAPI, UploadFile, HTTPException
from pydantic import BaseModel
import logging

app = FastAPI()
logger = logging.getLogger(__name__)
# `controller` is the DocumentProcessingController wired up at the end of Step 2

class DocumentResponse(BaseModel):
    doc_type: str
    clauses: dict

@app.post("/process", response_model=DocumentResponse)
async def process_document(file: UploadFile):
    if not file.filename.endswith(('.pdf', '.docx', '.txt')):
        raise HTTPException(status_code=400, detail="Unsupported file format")
    try:
        # Note: .pdf and .docx uploads need a text-extraction step before decoding;
        # this sketch assumes plain-text content
        text = await file.read()
        result = controller.run(text.decode('utf-8'))
        return result
    except Exception as e:
        logger.exception("Document processing failed")
        raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
Step 4: Dockerfile for Deployment
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
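Building and running the image locally might look like this (the image tag and port mapping mirror the Dockerfile above; passing the OpenAI key via environment variable is an assumption about your setup):

docker build -t contract-processor .
docker run -p 8080:8080 -e OPENAI_API_KEY="$OPENAI_API_KEY" contract-processor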
🌐 Scaling: Adding Compliance & Summary Agents
ComplianceAgent
class ComplianceAgent:
    def __init__(self, llm: ChatOpenAI):
        self.llm = llm

    def check(self, clauses: dict) -> dict:
        prompt = f"""
        Check if the following clauses meet company policy:
        {json.dumps(clauses)}
        Return missing clauses or warnings.
        """
        response = self.llm.invoke(prompt).content.strip()
        return self._parse_compliance_result(response)

    def _parse_compliance_result(self, raw: str) -> dict:
        # Implement structured parsing logic
        ...
SummarizerAgent
class SummarizerAgent:
    def __init__(self, llm: ChatOpenAI):
        self.llm = llm

    def summarize(self, doc_text: str) -> str:
        prompt = f"""
        Provide a concise summary of the legal document below (max 200 words):
        {doc_text[:3000]}
        """
        return self.llm.invoke(prompt).content.strip()
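One way to fold the new agents in is to extend the controller from Step 2. A minimal sketch (the result keys "compliance" and "summary" are assumptions):

class ExtendedDocumentProcessingController(DocumentProcessingController):
    def __init__(self, agents: dict):
        super().__init__(agents)
        self.compliance = agents['compliance']
        self.summarizer = agents['summarizer']

    def run(self, doc_text: str) -> dict:
        # Run classification and extraction first, then enrich the result
        result = super().run(doc_text)
        result["compliance"] = self.compliance.check(result["clauses"])
        result["summary"] = self.summarizer.summarize(doc_text)
        return result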
🧩 Persistence and State Management
Use MongoDB or PostgreSQL to store:
- Processed documents
- Agent outputs
- Audit logs
- User feedback
Example schema for storing results:
{
  "document_id": "uuid",
  "filename": "nda_contract_v1.pdf",
  "timestamp": "2025-04-05T14:30:00Z",
  "classification": "NDA",
  "clauses": {
    "termination": "...",
    "jurisdiction": "..."
  },
  "compliance_status": "partial",
  "summary": "..."
}
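Persisting a result to MongoDB might look like this minimal sketch (the connection string and the database/collection names are assumptions):

from uuid import uuid4
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumed connection string
collection = client["contracts"]["results"]        # assumed db/collection names

def save_result(filename: str, result: dict) -> str:
    # Build a record matching the schema above and insert it
    record = {
        "document_id": str(uuid4()),
        "filename": filename,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "classification": result["doc_type"],
        "clauses": result["clauses"],
    }
    collection.insert_one(record)
    return record["document_id"]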
🧪 Testing Strategy
Unit Test Example (pytest)
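The test below assumes a minimal MockLLM stub that mimics the .invoke(...).content interface of ChatOpenAI, for example:

from types import SimpleNamespace

class MockLLM:
    def __init__(self, response: str):
        self.response = response

    def invoke(self, prompt: str):
        # Mirror the real client: invoke() returns an object with a .content attribute
        return SimpleNamespace(content=self.response)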
def test_classifier_agent():
    mock_llm = MockLLM(response="NDA")
    agent = ClassifierAgent(mock_llm)
    result = agent.classify("This agreement is between two parties...")
    assert result == "NDA"
Integration Test
from fastapi.testclient import TestClient
from main import app  # assumes the app lives in main.py, as in the Dockerfile

client = TestClient(app)

def test_full_pipeline():
    response = client.post("/process", files={"file": ("test.txt", b"This is an NDA...")})
    assert response.status_code == 200
    assert response.json()["doc_type"] == "NDA"
🧰 CI/CD Automation with GitHub Actions
Add a .github/workflows/deploy.yml file:
name: Deploy Contract Processor

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run tests
        run: pytest
      - name: Build Docker image
        run: docker build -t contract-processor .
      - name: Push to container registry
        run: |
          docker tag contract-processor your-registry/contract-processor:latest
          docker push your-registry/contract-processor:latest
📈 Observability and Monitoring
- Structured Logging: Use structlog or loguru for searchable logs.
- Metrics: Expose Prometheus metrics for latency, success rates, and error counts (see the sketch below).
- Tracing: Integrate with Jaeger or OpenTelemetry for tracing agent interactions.
- Alerting: Use Grafana or Datadog to monitor performance and failures.
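As a sketch of the metrics piece, prometheus_client can wrap the controller; the metric names and port here are assumptions:

from prometheus_client import Counter, Histogram, start_http_server

DOCS_PROCESSED = Counter("documents_processed_total", "Documents processed", ["doc_type"])
PIPELINE_LATENCY = Histogram("pipeline_latency_seconds", "End-to-end pipeline latency")

@PIPELINE_LATENCY.time()
def process_with_metrics(doc_text: str) -> dict:
    # Time the full pipeline and count results by document type
    result = controller.run(doc_text)
    DOCS_PROCESSED.labels(doc_type=result["doc_type"]).inc()
    return result

start_http_server(9100)  # assumed port; Prometheus scrapes /metrics here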
🤝 Collaboration and Knowledge Sharing
A production system should be built to scale across teams:
- Modular Codebase: Each agent lives in its own module (agents/classifier.py, etc.)
- Shared SDKs: Create reusable libraries for common LLM patterns.
- Documentation: Write clear docstrings and use Swagger UI for API docs.
- Prompt Versioning: Store prompts in a version-controlled config system (see the sketch after this list).
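Prompt versioning can be as simple as a YAML file checked into the repo. A minimal sketch (the prompts.yaml layout and the version keys are assumptions):

import yaml

# prompts.yaml (assumed layout):
# classifier:
#   v1: |
#     What type of legal document is this? ...
with open("prompts.yaml") as f:
    PROMPTS = yaml.safe_load(f)

classifier_prompt = PROMPTS["classifier"]["v1"]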
Building a multi-agent LLM system for document processing in the enterprise is no small task. It requires a blend of AI expertise, software engineering rigor, and operational know-how.
But by structuring your code modularly, writing tests, automating deployments, and adding observability, you're showing the world that you can build real-world, production-grade AI applications.
This kind of project will not only help automate business processes but also stand out on your GitHub profile when applying for ML, AI, or backend engineering roles.
💬 Connect With Me
If you found this helpful, leave a comment or connect with me.