⚡ From Thunderstruck to Threaded Brilliance: How LangGraph Supercharged My AI Workflow


Just when I thought I had mastered the world of AI APIs and async queues, I stumbled upon something that completely rewired my thinking—a tool so powerful that I genuinely felt thunderstruck.
That tool is called LangGraph. And yes, it’s an absolute masterstroke. 🎯
😩 The Pain Before LangGraph
I was building a Next.js frontend that connected to an AI backend. Everything looked good… until it was time to:
Call multiple APIs in sequence
Handle conditional logic based on user input
Manage intermediate state for each step
Optimize latency and token use
What I ended up with was a spaghetti bowl of logic and an overwhelmed frontend making 4–5 sequential API calls per user input.
I needed one clean API. One single endpoint. One unified graph.
And then — LangGraph entered the chat.
🧠 What Is LangGraph?
LangGraph is like React for AI workflows — it helps you:
Define nodes (tasks)
Define edges (transitions)
Execute a flow that moves across states
It handles complex logic like:
Branching based on conditions
Memory/state passing across steps
Clean, observable control over AI reasoning pipelines
So instead of gluing together multiple microservices or function calls, LangGraph lets you define the entire logic as a stateful, directional graph.
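To make that concrete, here is a tiny toy graph. The ToyState/shout names are invented purely for illustration and are not part of the project below; the point is just to show nodes, edges, and state in a few lines:

from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END

class ToyState(TypedDict):
    text: str

def shout(state: ToyState):
    # A node: reads the state, updates it, passes it on.
    state["text"] = state["text"].upper()
    return state

builder = StateGraph(ToyState)       # the graph carries ToyState
builder.add_node("shout", shout)     # a node is a task
builder.add_edge(START, "shout")     # an edge is a transition
builder.add_edge("shout", END)
toy = builder.compile()

print(toy.invoke({"text": "hello, langgraph"}))  # {'text': 'HELLO, LANGGRAPH'}

One compile() call gives you a single invocable object, and that is exactly the pattern the real agent below follows at a larger scale.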
💡 What I Built With LangGraph
I created a LangGraph-powered agent that:
Classifies if a user query is code-related or general.
Based on the result:
Uses GPT-4.1 for coding queries
Uses GPT-4.1-mini for general queries (cheaper)
Validates the coding result’s accuracy (using GPT-4.1-mini).
Returns the final answer + accuracy score to the frontend.
Just one API call. One powerful workflow. Let’s break it down with the code. 🔍
💻 The Code That Tells The Story
🧱 Setup & Imports
# flake8: noqa
from typing_extensions import TypedDict
from openai import OpenAI
from typing import Literal
from dotenv import load_dotenv
from langgraph.graph import StateGraph, START, END
from pydantic import BaseModel
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
We’re using LangGraph, the OpenAI SDK, FastAPI, and a couple of Pydantic models. dotenv is used to load API keys, and TypedDict helps define the structure of our state object.
⚙️ App Initialization
load_dotenv()
client = OpenAI()
app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
We're spinning up a CORS-enabled FastAPI backend that can be accessed by any frontend (like our Next.js app).
🧾 State Definitions
📦 Models to Parse JSON
class ClassifyMessageResponse(BaseModel):
    isCodingQuestion: bool

class CodeAccuracyResponse(BaseModel):
    accuracyPercentage: str
These models define what the OpenAI response is expected to look like: structured JSON parsed via response_format.
📊 The Core State Object
class State(TypedDict):
    user_query: str
    llm_result: str | None
    accuracyPercentage: str | None
    isCodingQuestion: bool | None
This state will be passed from node to node. Each node reads and updates it.
🔍 Step-by-Step Nodes
1️⃣ Classify the Message
def classify_message(state: State):
    print("Classifying message")
    query = state["user_query"]

    SYSTEM_PROMPT = """
    You are a helpful AI agent.
    Your job is to detect whether a user query is related to coding or not.
    Return the response in the specified JSON boolean format only.
    """

    response = client.beta.chat.completions.parse(
        model="gpt-4.1-nano",
        response_format=ClassifyMessageResponse,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query}
        ]
    )

    isCodingQuestion = response.choices[0].message.parsed.isCodingQuestion
    state["isCodingQuestion"] = isCodingQuestion
    return state
Uses GPT-4.1-nano (super cheap and fast) to detect if the query is code-related.
2️⃣ Route the Query
def route_query(state: State) -> Literal["general_query", "coding_query"]:
    print("route_query")
    is_coding = state["isCodingQuestion"]
    return "coding_query" if is_coding else "general_query"
This conditional router acts like a switch-case in LangGraph and routes the flow accordingly.
3️⃣ Handle General Queries
def general_query(state: State):
    print("General Query!")
    query = state["user_query"]

    SYSTEM_PROMPT = """
    You are a helpful and smart AI agent.
    Answer the provided user query precisely and smartly.
    """

    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query}
        ]
    )

    answer = response.choices[0].message.content
    state["llm_result"] = answer
    return state
This node answers non-coding questions using GPT-4.1-mini to save cost.
4️⃣ Handle Coding Queries
def coding_query(state: State):
    print("coding query")
    query = state["user_query"]

    SYSTEM_PROMPT = """
    You are a very precise and helpful coding AI assistant.
    You are highly skilled at solving coding queries and respond to the user
    with clear explanations and good-quality answers.
    """

    response = client.chat.completions.create(
        model="gpt-4.1",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query}
        ]
    )

    answer = response.choices[0].message.content
    state["llm_result"] = answer
    return state
Here’s where the heavy lifting happens for coding queries — using GPT-4.1 for depth and precision.
5️⃣ Validate Coding Accuracy
def coding_validate_query(state: State):
    print("Validating message")
    query = state["user_query"]
    llm_result = state["llm_result"]

    SYSTEM_PROMPT = f"""
    You are a helpful AI agent.
    Your job is to judge the accuracy of the coding answer provided below.

    User query: {query}
    Code: {llm_result}
    """

    response = client.beta.chat.completions.parse(
        model="gpt-4.1-mini",
        response_format=CodeAccuracyResponse,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
        ]
    )

    accuracy = response.choices[0].message.parsed.accuracyPercentage
    state["accuracyPercentage"] = accuracy
    return state
This node analyzes the output quality, returning an accuracy score like "97%".
💡 Idea: If accuracy < 95%, you could re-run the query using Claude or another fallback model.
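Here's a minimal sketch of that fallback idea, assuming the accuracy stays a string like "97%" and we simply loop back to coding_query when it's low. route_on_accuracy is a hypothetical helper, not part of the code above, and the wiring comment refers to the graph_builder defined in the next section; routing to a Claude-backed node instead would follow the same pattern:

def route_on_accuracy(state: State):
    # Hypothetical retry router: parse "97%" -> 97.0 and re-run the
    # coding node when the reported accuracy is below 95.
    raw = (state["accuracyPercentage"] or "0").replace("%", "").strip()
    try:
        score = float(raw)
    except ValueError:
        score = 0.0
    return "coding_query" if score < 95 else END

# Wired in place of the plain coding_validate_query -> END edge
# (in practice you would also track a retry count in the state to avoid loops):
# graph_builder.add_conditional_edges("coding_validate_query", route_on_accuracy)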
🧠 Defining the LangGraph
graph_builder = StateGraph(State)

graph_builder.add_node("classify_message", classify_message)
graph_builder.add_node("general_query", general_query)
graph_builder.add_node("coding_query", coding_query)
graph_builder.add_node("coding_validate_query", coding_validate_query)

# route_query is a router function, not a node: it gets wired in
# through add_conditional_edges below.
graph_builder.add_edge(START, "classify_message")
graph_builder.add_conditional_edges("classify_message", route_query)
graph_builder.add_edge("general_query", END)
graph_builder.add_edge("coding_query", "coding_validate_query")
graph_builder.add_edge("coding_validate_query", END)

graph = graph_builder.compile()
This is where the magic happens:
add_node() defines the tasks
add_edge() wires up the transitions
route_query branches the flow dynamically via add_conditional_edges()
Everything finishes at END
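If you prefer the branch targets to be explicit (and to catch typos at build time), add_conditional_edges also accepts a mapping from the router's return value to node names. A sketch of the same wiring; this is an alternative to the implicit form above, not something to add on top of it:

# Explicit path map: the keys are route_query's return values,
# the values are the node names to jump to.
graph_builder.add_conditional_edges(
    "classify_message",
    route_query,
    {
        "coding_query": "coding_query",
        "general_query": "general_query",
    },
)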
🔌 FastAPI Endpoint
class Agentic(BaseModel):
    prompt: str

@app.post("/")
def main(user: Agentic):
    _state = {
        "user_query": user.prompt,
        "accuracyPercentage": None,
        "isCodingQuestion": None,
        "llm_result": None
    }
    response = graph.invoke(_state)
    return {
        "response": response["llm_result"],
        "accuracy": response["accuracyPercentage"]
    }
Now your entire complex logic is just one API call from your frontend!
Perfect for Next.js, React, Flutter, or anything else. ⚡
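Before touching the frontend, you can sanity-check the endpoint from a tiny Python script. A sketch, assuming the app is served locally on port 8000 (for example via uvicorn main:app --reload):

import requests

# Hypothetical local smoke test; point the URL at wherever the FastAPI app runs.
resp = requests.post(
    "http://localhost:8000/",
    json={"prompt": "Write a binary search in Python"},
)
data = resp.json()
print(data["response"])
print("Accuracy:", data["accuracy"])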
🧩 Ideas to Extend This
Add Claude or Mixtral as fallback if accuracy < 90%
Store query history in a database
Add retry logic or timeouts per node
Visualize the graph with D3 or Mermaid (see the sketch below)
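On the Mermaid idea: the compiled graph can describe itself, so there's nothing to hand-draw. A minimal sketch, assuming a recent langgraph release where compiled graphs expose get_graph():

# Print a Mermaid definition of the compiled graph; paste it into
# https://mermaid.live or a Markdown file to render the flow.
print(graph.get_graph().draw_mermaid())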
💥 From Thunderstruck to Threaded Brilliance
LangGraph didn’t just solve my problem — it elevated how I think about AI workflows.
From bottlenecks to brilliance, I now ship AI features that are faster, smarter, and cleaner than ever before.
If you’re still writing multiple chained API calls or conditionals in your code,
you’re missing out on what LangGraph can do for your stack.