Building a Simple AI Agent with Chain-of-Thought Prompting (No Framework Required)

Sandip Deshmukh

In this article, we’ll walk through creating a simple AI agent that uses the Chain-of-Thought (CoT) technique to reason and solve queries step-by-step. The goal is to understand how agents work under the hood without depending on external frameworks.

We'll build an agent that can respond to weather-related questions using a public weather API.


1. Initial Setup

Create a Project Directory

mkdir simple_agent
cd simple_agent

Create a Virtual Environment (for Windows)

python -m venv .venv
.\.venv\Scripts\activate

For macOS/Linux, create and activate it with:

python3 -m venv .venv
source .venv/bin/activate

Install Required Packages

pip install openai python-dotenv requests
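
If you prefer reproducible installs, you can capture the same dependencies in a requirements.txt (a minimal sketch; the floor on openai reflects the v1-style client used in this article, and exact pins are up to you):

openai>=1.0        # v1-style client (from openai import OpenAI)
python-dotenv
requests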

Create a .env File

Create a .env file in the root directory and add your API key:

API_KEY_GEMINI=your_api_key

You can get a free Gemini API key from: https://aistudio.google.com/apikey


2. Writing the Agent Code with Chain-of-Thought Prompting

Now, let's break down the agent code into clear parts and explain what each does.

Step 1: Import Required Libraries

Create a file named simple_agent.py and start by importing the required packages:

import json
import requests
from openai import OpenAI
from dotenv import load_dotenv
import os
  • requests makes the HTTP call to the weather API, and json parses and serializes the model's structured responses.

  • OpenAI is the official OpenAI client, which also works with Gemini through Google's OpenAI-compatible endpoint.

  • dotenv loads environment variables such as the API key, which we then read with os.getenv.


Step 2: Load Environment Variables

load_dotenv()

This loads the .env file into your environment so we can use os.getenv.
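
If the key is missing, the client will only fail later with a less obvious error, so it can help to fail fast right after loading (a small optional check, not part of the original code):

# Optional sanity check: fail early if the key was not loaded
api_key = os.getenv("API_KEY_GEMINI")
if not api_key:
    raise RuntimeError("API_KEY_GEMINI is not set; check your .env file")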


Step 3: Initialize the API Client

client = OpenAI(
    api_key=os.getenv('API_KEY_GEMINI'),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

We configure the OpenAI client to work with the Gemini-compatible API from Google.
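
Before wiring up the full loop, a one-off request is a quick way to confirm the key and base URL work (an optional smoke test; the model name matches the one used later in the article):

# One-off sanity check: should print a short greeting
test = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Say hello in one word."}]
)
print(test.choices[0].message.content)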


Step 4: Define Tools

This is our simulated "tool" the agent can use to get weather information.

def get_weather(city: str):
    url = f"https://wttr.in/{city}?format=%C+%t"
    response = requests.get(url)

    if response.status_code == 200:
        return f"The weather in {city} is {response.text}."
    return "Something went wrong"

We also create a tool registry:

available_tools = {
    "get_weather": {
        "fn": get_weather,
        "description": "Takes a city name as an input and returns the current weather for the city"
    }
}

This allows our agent to "know" which tools are available and how to use them.
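
You can sanity-check the registry by looking up and calling a tool directly, exactly the way the agent loop will later (the printed weather string is illustrative):

# Direct call through the registry, mirroring what the agent loop does
tool = available_tools["get_weather"]
print(tool["description"])
print(tool["fn"]("London"))  # e.g. "The weather in London is Partly cloudy +12°C."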


Step 5: Define the System Prompt

Chain-of-thought prompting relies on detailed planning. This prompt defines the step-by-step reasoning structure the model should follow.

system_prompt = f"""
You are a helpful AI assistant specialized in resolving user queries using reasoning.
You operate in the following steps: start → plan → action → observe → output.
Based on the user query and available tools, reason step-by-step to choose the right action.

Rules:
- Output must be a single JSON object.
- Always perform only one step per response.
- Only take an action after planning.
- Wait for observation before final answer.

JSON Format:
{{
    "step": "plan" | "action" | "observe" | "output",
    "content": "description of the reasoning",
    "function": "name of function if step is action",
    "input": "input to function if step is action"
}}

Available Tools:
{json.dumps({k: v["description"] for k, v in available_tools.items()}, indent=2)}

Example for query "What is the weather in New York?":
1. {{ "step": "plan", "content": "The user asked about weather in New York" }}
2. {{ "step": "plan", "content": "I will use the get_weather tool" }}
3. {{ "step": "action", "function": "get_weather", "input": "New York" }}
4. {{ "step": "observe", "content": "It is sunny and 20°C" }}
5. {{ "step": "output", "content": "The weather in New York is sunny and 20°C." }}
"""

Step 6: Message Loop (Agent Execution)

messages = [
    { "role": "system", "content": system_prompt }
]

We start the chat with our system prompt. Now the main loop:

while True:
    user_query = input('user :> ')
    messages.append({ "role": "user", "content": user_query })

    while True:
        response = client.chat.completions.create(
            model="gemini-2.0-flash",
            response_format={"type": "json_object"},
            messages=messages
        )

        parsed_output = json.loads(response.choices[0].message.content)
        messages.append({ "role": "assistant", "content": json.dumps(parsed_output) })

Here:

  • We send the user input and current message history to Gemini

  • We parse the structured JSON response (a guard for malformed output appears after this list)

  • We append the agent's step-by-step reply to the conversation
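
One practical caveat: json.loads will raise if the model ever returns malformed JSON, even with response_format set. A minimal guard (an optional hardening step, not in the original code) retries instead of crashing:

try:
    parsed_output = json.loads(response.choices[0].message.content)
except json.JSONDecodeError:
    # Nudge the model to retry instead of crashing the loop
    messages.append({ "role": "user", "content": "Respond with a single valid JSON object." })
    continue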


Step 7: Handle Steps

if parsed_output.get("step") == "plan":
            print(f"Planning: {parsed_output.get('content')}")
            continue

When the model is thinking, we just display its internal reasoning.

if parsed_output.get("step") == "action":
            tool_name = parsed_output.get("function")
            tool_input = parsed_output.get("input")

            if avaiable_tools.get(tool_name, False):
                output = avaiable_tools[tool_name]["fn"](tool_input)
                messages.append({ "role": "assistant", "content": json.dumps({ "step": "observe", "output": output }) })
                continue

If the agent wants to act, we run the tool and append the observation back to the model.
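Note that if the model names a tool that isn't in the registry, the code above falls through and re-calls the model with no new information. One option (a sketch, not in the original code) is to feed the mistake back as an observation so the model can correct itself:

if tool_name in available_tools:
    output = available_tools[tool_name]["fn"](tool_input)
    messages.append({ "role": "assistant", "content": json.dumps({ "step": "observe", "content": output }) })
else:
    # Hypothetical fallback: tell the model the tool doesn't exist
    error = { "step": "observe", "content": f"Unknown tool: {tool_name}" }
    messages.append({ "role": "assistant", "content": json.dumps(error) })
continue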

if parsed_output.get("step") == "output":
            print(f"Assistant: {parsed_output.get('content')}")
            break

Finally, once the model provides the answer, we display it.


3. Complete Code

import json
import requests
from openai import OpenAI
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Set up OpenAI client for Gemini-compatible endpoint
client = OpenAI(
    api_key=os.getenv('API_KEY_GEMINI'),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# Define tool function
def get_weather(city: str):
    print(f"Tool Called: get_weather({city})")
    url = f"https://wttr.in/{city}?format=%C+%t"
    response = requests.get(url)

    if response.status_code == 200:
        return f"The weather in {city} is {response.text.strip()}."
    return "Something went wrong"

# Tools dictionary
available_tools = {
    "get_weather": {
        "fn": get_weather,
        "description": "Takes a city name as input and returns current weather"
    }
}

# System Prompt using Chain of Thought-style reasoning
system_prompt = f"""
You are a helpful AI assistant specialized in resolving user queries using reasoning.
You operate in the following steps: start → plan → action → observe → output.
Based on the user query and available tools, reason step-by-step to choose the right action.

Rules:
- Output must be a single JSON object.
- Always perform only one step per response.
- Only take an action after planning.
- Wait for observation before final answer.

JSON Format:
{{
    "step": "plan" | "action" | "observe" | "output",
    "content": "description of the reasoning",
    "function": "name of function if step is action",
    "input": "input to function if step is action"
}}

Available Tools:
{json.dumps({k: v["description"] for k, v in available_tools.items()}, indent=2)}

Example for query "What is the weather in New York?":
1. {{ "step": "plan", "content": "The user asked about weather in New York" }}
2. {{ "step": "plan", "content": "I will use the get_weather tool" }}
3. {{ "step": "action", "function": "get_weather", "input": "New York" }}
4. {{ "step": "observe", "content": "It is sunny and 20°C" }}
5. {{ "step": "output", "content": "The weather in New York is sunny and 20°C." }}
"""

# Initialize message history
messages = [
    { "role": "system", "content": system_prompt }
]

# chat loop
while True:
    user_query = input("User: ")
    messages.append({ "role": "user", "content": user_query })

    while True:
        # Call Gemini model
        response = client.chat.completions.create(
            model="gemini-2.0-flash",
            response_format={"type": "json_object"},
            messages=messages
        )

        parsed_output = json.loads(response.choices[0].message.content)
        messages.append({ "role": "assistant", "content": json.dumps(parsed_output) })

        step = parsed_output.get("step")

        if step == "plan":
            print(f"Thought: {parsed_output['content']}")
            continue

        if step == "action":
            tool_name = parsed_output.get("function")
            tool_input = parsed_output.get("input")

            if tool_name in available_tools:
                tool_output = available_tools[tool_name]["fn"](tool_input)
                observation = {
                    "step": "observe",
                    "content": tool_output
                }
                messages.append({ "role": "assistant", "content": json.dumps(observation) })
                continue

        if step == "output":
            print(f"Assistant: {parsed_output['content']}")
            break
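
As written, the outer loop runs until you press Ctrl+C. If you'd rather exit cleanly, a small check at the top of the loop works (an optional addition):

user_query = input("User: ")
if user_query.strip().lower() in ("exit", "quit"):
    break
messages.append({ "role": "user", "content": user_query })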

4. Sample Run

Run python simple_agent.py in your terminal:

User: Hi      
Thought: The user greeted me. I should respond in a helpful way.
Assistant: Hi there! How can I help you today?
User: what is the weather in pune ?
Thought: The user is asking about the weather in Pune. I should use the get_weather tool to find the weather.
Tool Called: get_weather(Pune)
Assistant: The weather in Pune is Light rain shower and 25°C.

5. Summary

This project demonstrated how to:

  • Build an agent from scratch using the Gemini API

  • Use Chain-of-Thought prompting for reasoning

  • Integrate external tools (like weather APIs)

  • Orchestrate a simple decision-action-observation loop

All without using any frameworks.

Full source code: https://github.com/sandipdeshmukh77/simple-agent
