From If-Else Chaos to Clean Graphs: Mastering Agent Workflows with LangGraph


The Problem
Let’s build an agentic workflow based on the following use case:
Workflow Description:
In this workflow, the user submits a query, which is first analyzed to determine whether it is related to coding. For this initial classification we use GPT-4.1-nano, as it is both fast and cost-efficient.
If the query is related to coding, we pass it to GPT-4.1, which is well suited to complex computational tasks. The result, along with the original user query, is then sent back to GPT-4.1 to validate the generated code's accuracy. If the accuracy is above 95%, we display the code to the user; if not, the system retries until the threshold is met.
If the query is general (non-coding), it is processed by GPT-4.1-mini and the response is returned directly to the user.
Code for the workflow described above
Although this workflow is relatively simple—with only a few steps and limited branching—it’s still manageable to write and follow. However, imagine a scenario where there are 20 or more nodes. The code would quickly become messy, filled with nested if-else
blocks, repeated logic, and poor readability.
This is exactly where LangGraph shines. It abstracts away the low-level control flow logic by letting developers focus solely on defining nodes. These nodes can even be reused or shared independently. LangGraph then compiles these into a stateful graph and handles the flow for you.
def classify_query(query: str) -> bool:
    # Returns True if it's a coding query, otherwise False, using GPT-4.1-nano
    ...

def general_query(query: str) -> str:
    # Handle general queries using GPT-4.1-mini
    ...

def coding_query(query: str) -> str:
    # Generate code using GPT-4.1
    ...

def coding_validate_query(query: str, code: str) -> float:
    # Validate code using GPT-4.1 and return accuracy as a float
    ...

query = input("> ")
is_coding_query = classify_query(query)

if not is_coding_query:
    print(f"Bot: {general_query(query)}")
else:
    while True:
        model_result = coding_query(query)
        accuracy = coding_validate_query(query, model_result)
        if accuracy >= 95:
            print(f"Bot: {model_result} with accuracy {accuracy}%")
            break
LangGraph
Before building the workflow, it’s important to understand that every graph (or workflow) in this system operates on a state.
When the graph starts, this state is passed into the graph, and it flows through each node. Each node represents a unit of logic—typically an API call to an LLM—and is responsible for updating the state with its results.
Meanwhile, edges define how the state transitions between nodes. They act as paths that determine which node comes next, based on logic, conditions, or simple sequencing.
It’s also important to note that nodes always return a state, unless the node is specifically a routing node.
Standard nodes (like those making LLM calls or performing transformations) take in the current state, perform some logic, and return an updated version of that state.
Routing nodes, on the other hand, don’t return a modified state. Instead, they return a string or condition that determines the next path (i.e., which edge to follow next).
This distinction allows LangGraph to separate data processing from control flow, making the workflow both modular and easy to reason about.
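To make this distinction concrete, here is a minimal, framework-free sketch. A plain dict stands in for the graph state, and the node names (greet_node, route_node, fallback_node) are illustrative, not part of LangGraph's API:

```python
def greet_node(state: dict) -> dict:
    # Standard node: reads the state, does some work, returns the updated state.
    state["greeting"] = f"Hello, {state['name']}!"
    return state

def route_node(state: dict) -> str:
    # Routing node: returns the NAME of the next node, not a state.
    return "greet_node" if state.get("name") else "fallback_node"

state = {"name": "Ada"}
print(route_node(state))              # -> greet_node
print(greet_node(state)["greeting"])  # -> Hello, Ada!
```

The same contract appears in the real workflow below: functional nodes return the state, routing nodes return a node name.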
Every LangGraph workflow has two special nodes: START and END.
START: This is the entry point of the graph. It defines where execution begins. You typically connect it to the first functional node (like a classifier or input handler).
END: This is the termination point of the graph. Once the state reaches this node, the workflow completes, and the final state is returned to the caller.
These two nodes help define the execution boundary of the workflow. They make it easy to structure and visualize the beginning and end of an agent's lifecycle.
Now, let's build the same workflow using LangGraph. Based on the concepts above, the workflow changes slightly and looks like this.
Code for the agent workflow using LangGraph
from typing import Literal
from dotenv import load_dotenv
from typing_extensions import TypedDict
from openai import OpenAI
from pydantic import BaseModel
from langgraph.graph import START, StateGraph, END
Here we import the required modules. Standard typing utilities like TypedDict and Literal are used to define the structure and control flow of the shared state, while pydantic.BaseModel helps enforce structured responses from the LLMs. The dotenv package is used to load environment variables, such as the OpenAI API key, from a .env file.
load_dotenv()
client = OpenAI()
After loading the environment, we create an instance of the OpenAI client, which is used throughout the workflow to call the different models: gpt-4.1-nano, gpt-4.1, and gpt-4.1-mini.
# These classes force the model to return a STRUCTURED response.
# We pass them to the client so the model's output is parsed into the shapes below.
# Pydantic in Python plays the role of Zod in TypeScript.
class ClassifyMessageResponse(BaseModel):
    is_coding_problem: bool

class CodeAccuracyResponse(BaseModel):
    accuracy_percentage: int
To ensure that the models return well-structured, parsable outputs, two Pydantic classes are defined: ClassifyMessageResponse for determining whether a query is coding-related, and CodeAccuracyResponse for capturing the accuracy score of the generated code. These act like runtime schemas, similar to how zod is used in TypeScript.
class State(TypedDict):
    query: str
    llm_result: str | None
    accuracy_percentage: int | None
    is_coding_problem: bool | None
Next, we define State, a TypedDict that serves as the shared memory passed between nodes in the workflow. It contains the keys query, llm_result, accuracy_percentage (an int, since the validator returns a percentage), and is_coding_problem, which each node can update as needed.
Next, we create all the functional and routing nodes. Each node accepts the state as a parameter and returns the updated state if it is a functional (standard) node, or a string naming the next node if it is a routing node.
def classify_query(state: State):
    print("⚠️ classify_query")
    query = state["query"]

    CLASSIFY_SYSTEM_PROMPT = """
    You are an AI Assistant. Your job is to detect whether the user's query is related to coding or not.
    Return the response as a boolean: True for a coding query, False for a general query.
    """

    res = client.beta.chat.completions.parse(
        model="gpt-4.1-nano",
        response_format=ClassifyMessageResponse,
        messages=[
            {"role": "system", "content": CLASSIFY_SYSTEM_PROMPT},
            {"role": "user", "content": query},
        ],
    )

    is_coding_problem = res.choices[0].message.parsed.is_coding_problem
    state["is_coding_problem"] = is_coding_problem
    return state
classify_query uses the gpt-4.1-nano model to quickly and cost-effectively determine whether the user's query is coding-related. It updates the is_coding_problem key in the state.
def route_query(state: State) -> Literal["general_query", "coding_query"]:
    is_coding_problem = state["is_coding_problem"]
    if is_coding_problem:
        return "coding_query"
    return "general_query"
This routing node checks whether is_coding_problem is true. If it is, it returns the name of the coding_query node; otherwise it returns general_query (below you will see how these names are assigned). Routing nodes like this are what make more complex workflows possible.
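To see what the graph does with that returned string, here is a framework-free sketch of a conditional edge dispatching on it; the nodes dict and the lambdas are illustrative, not LangGraph's machinery:

```python
# The routing function's return value is used as a key to pick the next node.
def route_query(state: dict) -> str:
    return "coding_query" if state["is_coding_problem"] else "general_query"

nodes = {
    "coding_query": lambda s: {**s, "llm_result": "generated code"},
    "general_query": lambda s: {**s, "llm_result": "general answer"},
}

state = {"query": "write a loop", "is_coding_problem": True}
next_node = route_query(state)   # -> "coding_query"
state = nodes[next_node](state)  # run the selected node
```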
def coding_query(state: State):
    print("⚠️ coding_query")
    query = state["query"]

    CODING_SYSTEM_PROMPT = """
    You are a Coding Expert Agent.
    """

    res = client.chat.completions.create(
        model="gpt-4.1",
        messages=[
            {"role": "system", "content": CODING_SYSTEM_PROMPT},
            {"role": "user", "content": query},
        ],
    )

    llm_result = res.choices[0].message.content
    state["llm_result"] = llm_result
    return state
Uses gpt-4.1 to generate a code solution for the given query and stores the result in the llm_result key of the state.
def general_query(state: State):
    print("⚠️ general_query")
    query = state["query"]

    res = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "user", "content": query},
        ],
    )

    llm_result = res.choices[0].message.content
    state["llm_result"] = llm_result
    return state
For non-coding queries, this node uses gpt-4.1-mini to generate a general answer.
def coding_validate_query(state: State):
    print("⚠️ coding_validate_query")
    query = state["query"]
    llm_result = state["llm_result"]

    SYSTEM_PROMPT = f"""
    You are an expert in judging the accuracy of code against the question.
    Return the accuracy as a percentage between 0 and 100.

    User Query: {query}
    Code: {llm_result}
    """

    res = client.beta.chat.completions.parse(
        model="gpt-4.1",
        response_format=CodeAccuracyResponse,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query},
        ],
    )

    accuracy_percentage = res.choices[0].message.parsed.accuracy_percentage
    print("Accuracy from function: ", accuracy_percentage)
    state["accuracy_percentage"] = accuracy_percentage
    return state
Takes the generated code and asks GPT-4.1 to evaluate its accuracy against the original query, saving accuracy_percentage in the state as an int.
def route_code_validate_query(state: State) -> Literal[END, "coding_query"]:
    accuracy = state["accuracy_percentage"]
    if accuracy < 95:
        return "coding_query"
    return END
Checks whether the accuracy is below 95. If it is, the flow loops back to coding_query to regenerate the code; otherwise it moves to END.
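To make the retry semantics concrete, here is a deterministic, framework-free simulation; run_retry_loop and the fake accuracy scores are illustrative stand-ins for the LLM validator, not part of the workflow:

```python
def run_retry_loop(fake_accuracies, threshold=95):
    # Each score stands in for one coding_query -> coding_validate_query pass.
    attempts = 0
    for accuracy in fake_accuracies:
        attempts += 1
        if accuracy >= threshold:  # route_code_validate_query would return END
            return attempts, accuracy
    # With a real LLM the graph would keep looping; here we just stop.
    raise RuntimeError("threshold never met")

print(run_retry_loop([80, 90, 97]))  # -> (3, 97)
```

Note that, as written, the workflow has no cap on retries; in production you would typically track an attempt counter in the state and bail out after a maximum.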
graph_builder = StateGraph(State)
Initializes a new graph in which every node operates on the shared state defined by State. This sets up the structure for building and connecting the workflow nodes that will process and update that state.
# Defining Nodes
graph_builder.add_node("classify_query", classify_query)
graph_builder.add_node("coding_query", coding_query)
graph_builder.add_node("general_query", general_query)
graph_builder.add_node("coding_validate_query", coding_validate_query)
# Note: the routing functions (route_query, route_code_validate_query) are not
# registered with add_node(); they are attached via add_conditional_edges().
Here we register the graph's functional nodes. The add_node() method takes the name of the node and the function that node runs.
# Defining Edges
graph_builder.add_edge(START, "classify_query")
graph_builder.add_conditional_edges("classify_query", route_query)
graph_builder.add_edge("general_query", END)
graph_builder.add_edge("coding_query", "coding_validate_query")
graph_builder.add_conditional_edges("coding_validate_query", route_code_validate_query)
These edges connect the nodes and specify the direction of flow needed to achieve the desired result. The add_edge() method takes two arguments: the source and the destination. The add_conditional_edges() method is used for routing: the first argument is the source node, and the second is the routing function whose return value names the next node.
# Compile the graph
graph = graph_builder.compile()
Once the graph is fully defined, it is compiled into an executable object using graph_builder.compile().
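To build intuition for what the compiled graph does when invoked, here is a simplified, framework-free interpreter. This is a sketch of the execution model under stated assumptions, not LangGraph's internals, and all names are illustrative:

```python
START, END = "__start__", "__end__"

def classify(state):
    # Toy classifier: treats queries containing "def " as coding queries.
    state["is_coding_problem"] = "def " in state["query"]
    return state

def general(state):
    state["llm_result"] = "general answer"
    return state

nodes = {"classify_query": classify, "general_query": general}
plain_edges = {START: "classify_query", "general_query": END}
# Conditional edges call the routing function to pick the next node name.
routers = {
    "classify_query": lambda s: END if s["is_coding_problem"] else "general_query",
}

def invoke(state):
    current = plain_edges[START]
    while current != END:
        state = nodes[current](state)   # run the node
        if current in routers:          # conditional edge
            current = routers[current](state)
        else:                           # plain edge
            current = plain_edges[current]
    return state

print(invoke({"query": "hello"})["llm_result"])  # -> general answer
```

The real compiled graph does much more (state merging, streaming, checkpoints), but the node-then-edge loop above is the core idea.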
def main():
    user_query = input("> ")

    state: State = {
        "query": user_query,
        "llm_result": None,
        "accuracy_percentage": None,
        "is_coding_problem": False,
    }

    response = graph.invoke(state)
    print(f"Bot: {response}")

main()
Here, we take the user_query and build the initial state object. To run the entire agentic workflow, we call graph.invoke(state) and print the result.
# Output
(venv) ➜ 06-langgraph python graph.py
> Hi, my name is Karan
⚠️ classify_query
⚠️ general_query
Bot: {'query': 'Hi, my name is Karan', 'llm_result': 'Hello Karan! How can I assist you today?', 'accuracy_percentage': None, 'is_coding_problem': False}
(venv) ➜ 06-langgraph python graph.py
> Write a Python function to remove duplicates from a list using only a set
⚠️ classify_query
⚠️ coding_query
⚠️ coding_validate_query
Accuracy from function: 100
Bot: {'query': 'Write a Python function to remove duplicates from a list using only a set', 'llm_result': 'Absolutely! If you want to **remove duplicates from a list** using only a `set`, and preserve no specific order, you can simply do:\n\n```python\ndef remove_duplicates(lst):\n return list(set(lst))\n```\n\nThis converts the list to a set (which removes duplicates) and then back to a list.\n\n## Example:\n```python\ndata = [1, 2, 2, 3, 4, 4, 5]\nunique = remove_duplicates(data)\nprint(unique) # Output could be: [1, 2, 3, 4, 5] (order not guaranteed)\n```\n\n**Note:** \nIf you need to **preserve the original order** while using a set to check for duplicates (which you said "using only a set"; so we\'ll avoid `dict` or other standard approaches):\n\n```python\ndef remove_duplicates(lst):\n seen = set()\n result = []\n for item in lst:\n if item not in seen:\n seen.add(item)\n result.append(item)\n return result\n```\n\nBut the first, shortest solution above is usually what is meant if order does not matter!', 'accuracy_percentage': 100, 'is_coding_problem': True}
I’ve published the complete code for this agentic workflow—both with and without LangGraph—as GitHub Gists. You can explore them side by side to better understand the structural differences and benefits of using LangGraph.
These examples should help reinforce the concepts discussed and serve as a starting point for building your own agentic systems.
Wrapping Up
- Defined a simple agentic workflow that classifies user queries and handles them differently based on type.
- Compared imperative (if-else) logic with a graph-based approach to highlight how complexity scales.
- Introduced StateGraph to structure workflows using nodes and edges while managing shared state.
- Used routing nodes to control flow, including conditional branching and retry loops.
- Demonstrated how LangGraph simplifies logic, improves modularity, and makes workflows easier to maintain.
Written by

Karan Shaw
I am Karan Shaw. I am a MERN stack developer experienced in building responsive user interfaces using ReactJs and NextJs and developing backend services using MongoDB and ExpressJs.