Building a Simple AI Agent with Chain-of-Thought Prompting (No Framework Required)


In this article, we’ll walk through creating a simple AI agent that uses the Chain-of-Thought (CoT) technique to reason and solve queries step-by-step. The goal is to understand how agents work under the hood without depending on external frameworks.
We'll build an agent that can respond to weather-related questions using a public weather API.
1. Initial Setup
Create a Project Directory
mkdir simple_agent
cd simple_agent
Create and Activate a Virtual Environment
python -m venv .venv
On Windows, activate it with:
.\.venv\Scripts\activate
For macOS/Linux, use:
source .venv/bin/activate
Install Required Packages
pip install openai python-dotenv requests
Create a .env File
Create a .env file in the root directory and add your API key:
API_KEY_GEMINI="your_api_key"
You can get a free Gemini API key from: https://aistudio.google.com/apikey
2. Writing the Agent Code with Chain-of-Thought Prompting
Now, let's break down the agent code into clear parts and explain what each does.
Step 1: Import Required Libraries
Create a file named simple_agent.py and import the required packages:
import json
import requests
from openai import OpenAI
from dotenv import load_dotenv
import os
requests lets us call the weather API over HTTP, and json lets us parse the model's structured responses.
OpenAI is the official client, used here to access Gemini through its OpenAI-compatible API.
dotenv loads environment variables such as the API key.
Step 2: Load Environment Variables
load_dotenv()
This loads the .env file into your environment so we can read values with os.getenv.
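If the key is missing, the client only fails later with an opaque authentication error, so an early sanity check can save debugging time (optional, not part of the original script):
if not os.getenv("API_KEY_GEMINI"):
    raise RuntimeError("API_KEY_GEMINI is not set - check your .env file")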
Step 3: Initialize the API Client
client = OpenAI(
api_key=os.getenv('API_KEY_GEMINI'),
base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)
We configure the OpenAI client to work with the Gemini-compatible API from Google.
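Before wiring up the agent, you can confirm the client works with a one-off test call. This is a minimal sketch; it assumes your key is valid and that gemini-2.0-flash is available on your account:
test = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Say hello in one word."}]
)
print(test.choices[0].message.content)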
Step 4: Define Tools
This is the "tool" the agent can use to get weather information; it calls the free wttr.in service, so the data is real, not simulated.
def get_weather(city: str):
    url = f"https://wttr.in/{city}?format=%C+%t"
    response = requests.get(url)
    if response.status_code == 200:
        return f"The weather in {city} is {response.text}."
    return "Something went wrong"
We also create a tool registry:
available_tools = {
    "get_weather": {
        "fn": get_weather,
        "description": "Takes a city name as an input and returns the current weather for the city"
    }
}
This allows our agent to "know" which tools are available and how to use them.
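Because each entry pairs a callable with its description, dispatching a tool by name becomes a simple dictionary lookup:
tool = available_tools["get_weather"]
print(tool["description"])
result = tool["fn"]("Paris")  # equivalent to get_weather("Paris")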
Step 5: Define the System Prompt
Chain-of-thought prompting relies on detailed planning. This prompt defines the step-by-step reasoning structure the model should follow.
system_prompt = f"""
You are a helpful AI assistant specialized in resolving user queries using reasoning.
You operate in the following steps: start → plan → action → observe → output.
Based on the user query and available tools, reason step-by-step to choose the right action.

Rules:
- Output must be a single JSON object.
- Always perform only one step per response.
- Only take an action after planning.
- Wait for observation before final answer.

JSON Format:
{{
    "step": "plan" | "action" | "observe" | "output",
    "content": "description of the reasoning",
    "function": "name of function if step is action",
    "input": "input to function if step is action"
}}

Available Tools:
{json.dumps({k: v["description"] for k, v in available_tools.items()}, indent=2)}

Example for query "What is the weather in New York?":
1. {{ "step": "plan", "content": "The user asked about weather in New York" }}
2. {{ "step": "plan", "content": "I will use the get_weather tool" }}
3. {{ "step": "action", "function": "get_weather", "input": "New York" }}
4. {{ "step": "observe", "content": "It is sunny and 20°C" }}
5. {{ "step": "output", "content": "The weather in New York is sunny and 20°C." }}
"""
Step 6: Message Loop (Agent Execution)
messages = [
    { "role": "system", "content": system_prompt }
]
We start the chat with our system prompt. Now the main loop:
while True:
    user_query = input('user :> ')
    messages.append({ "role": "user", "content": user_query })

    while True:
        response = client.chat.completions.create(
            model="gemini-2.0-flash",
            response_format={"type": "json_object"},
            messages=messages
        )
        parsed_output = json.loads(response.choices[0].message.content)
        messages.append({ "role": "assistant", "content": json.dumps(parsed_output) })
Here:
We send the user input and current message history to Gemini
We parse the structured JSON response
We append the agent's step-by-step reply to the conversation
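Because we pass response_format={"type": "json_object"}, the model is constrained to emit valid JSON, so json.loads gives us a plain Python dict. An illustrative parsed_output for a planning turn:
{"step": "plan", "content": "The user is asking about the weather in Pune"}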
Step 7: Handle Steps
if parsed_output.get("step") == "plan":
    print(f"Planning: {parsed_output.get('content')}")
    continue
When the model is thinking, we just display its internal reasoning.
if parsed_output.get("step") == "action":
    tool_name = parsed_output.get("function")
    tool_input = parsed_output.get("input")
    if tool_name in available_tools:
        output = available_tools[tool_name]["fn"](tool_input)
        messages.append({ "role": "assistant", "content": json.dumps({ "step": "observe", "content": output }) })
    continue
If the agent wants to act, we run the tool and append the observation back to the model.
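The observation appended back looks something like this (illustrative; the actual text comes from wttr.in), and it is what the model sees on its next turn before producing the final answer:
{"step": "observe", "content": "The weather in Pune is Light rain shower +25°C."}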
if parsed_output.get("step") == "output":
    print(f"Assistant: {parsed_output.get('content')}")
    break
Finally, once the model provides the answer, we display it.
3. Complete Code
import json
import requests
from openai import OpenAI
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Set up OpenAI client for Gemini-compatible endpoint
client = OpenAI(
    api_key=os.getenv('API_KEY_GEMINI'),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

# Define tool function
def get_weather(city: str):
    print(f"Tool Called: get_weather({city})")
    url = f"https://wttr.in/{city}?format=%C+%t"
    response = requests.get(url)
    if response.status_code == 200:
        return f"The weather in {city} is {response.text.strip()}."
    return "Something went wrong"

# Tools dictionary
available_tools = {
    "get_weather": {
        "fn": get_weather,
        "description": "Takes a city name as input and returns current weather"
    }
}

# System prompt using Chain-of-Thought-style reasoning
system_prompt = f"""
You are a helpful AI assistant specialized in resolving user queries using reasoning.
You operate in the following steps: start → plan → action → observe → output.
Based on the user query and available tools, reason step-by-step to choose the right action.

Rules:
- Output must be a single JSON object.
- Always perform only one step per response.
- Only take an action after planning.
- Wait for observation before final answer.

JSON Format:
{{
    "step": "plan" | "action" | "observe" | "output",
    "content": "description of the reasoning",
    "function": "name of function if step is action",
    "input": "input to function if step is action"
}}

Available Tools:
{json.dumps({k: v["description"] for k, v in available_tools.items()}, indent=2)}

Example for query "What is the weather in New York?":
1. {{ "step": "plan", "content": "The user asked about weather in New York" }}
2. {{ "step": "plan", "content": "I will use the get_weather tool" }}
3. {{ "step": "action", "function": "get_weather", "input": "New York" }}
4. {{ "step": "observe", "content": "It is sunny and 20°C" }}
5. {{ "step": "output", "content": "The weather in New York is sunny and 20°C." }}
"""

# Initialize message history
messages = [
    { "role": "system", "content": system_prompt }
]

# Chat loop
while True:
    user_query = input("User: ")
    messages.append({ "role": "user", "content": user_query })

    while True:
        # Call Gemini model
        response = client.chat.completions.create(
            model="gemini-2.0-flash",
            response_format={"type": "json_object"},
            messages=messages
        )
        parsed_output = json.loads(response.choices[0].message.content)
        messages.append({ "role": "assistant", "content": json.dumps(parsed_output) })

        step = parsed_output.get("step")

        if step == "plan":
            print(f"Thought: {parsed_output['content']}")
            continue

        if step == "action":
            tool_name = parsed_output.get("function")
            tool_input = parsed_output.get("input")
            if tool_name in available_tools:
                tool_output = available_tools[tool_name]["fn"](tool_input)
                observation = {
                    "step": "observe",
                    "content": tool_output
                }
                messages.append({ "role": "assistant", "content": json.dumps(observation) })
            continue

        if step == "output":
            print(f"Assistant: {parsed_output['content']}")
            break
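The registry pattern also makes the agent easy to extend: a new capability is just another entry in available_tools, and the prompt's tool list updates automatically because it is generated from the registry (so add new tools before system_prompt is built). As a sketch, here is a hypothetical get_time tool, not part of the original script:
from datetime import datetime

def get_time(_: str = ""):
    # Toy example: returns the server's local time; the input is ignored.
    return f"The current local time is {datetime.now().strftime('%H:%M')}."

available_tools["get_time"] = {
    "fn": get_time,
    "description": "Returns the current local time; input is ignored"
}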
4. Sample Run
Run python simple_agent.py in your terminal:
User: Hi
Thought: The user greeted me. I should respond in a helpful way.
Assistant: Hi there! How can I help you today?
User: what is the weather in pune ?
Thought: The user is asking about the weather in Pune. I should use the get_weather tool to find the weather.
Tool Called: get_weather(Pune)
Assistant: The weather in Pune is Light rain shower and 25°C.
5. Summary
This project demonstrated how to:
Build an agent from scratch using the Gemini API
Use Chain-of-Thought prompting for reasoning
Integrate external tools (like weather APIs)
Orchestrate a simple decision-action-observation loop
All without using any frameworks.