What are AI Agents?

balu sekharbalu sekhar
6 min read

A large language model (LLM) is like a super-intelligent brain sitting in a jar. It knows a lot. It can reason, think, and even sound brilliant. But ask it to send an email, book a meeting, write code and run it, or search the web on its own, it just sits there.

User: Can you send an email to John?
AI: I cannot do that.

This is where AI agents come in.

Introduction:

In the world of AI, we already state of art Large Language Models, but they only can generate, but they do not perform any task.

An AI agent:

  • Knows its goal

  • Plans how to get there

  • And does it all without micromanagement.

This is possible by an AI agent because of Tools. Let us discuss deeper.

High Level Working of an AI Agent:

Let us understand this with the help of an example:

Example:
User: What is the current weather of Delhi?

Now as you know AI does not have access to real time information. But we can write a simple function to fetch real time weather by using any weather apis.

What if the LLM can you this function written by us? In technical terms this is called “Tool Calling”. Tools are nothing but Functions written to solve any particular tasks.

async function getWeatherDetails(cityName) {
  console.log("🔍 getWeatherDetails");
  const response = await axios.get(
    `http://api.openweathermap.org/data/2.5/weather?q=${cityName}&appid=${API_KEY}&units=metric`
  );
  return response.data.main.temp;
}

Now in the System prompt:

  • we tell to the LLM that you have access to a tool called getWeatherDetails which can be used to get real time weather.

  • We also instruct the LLM to use Chain of Thought (CoT) prompting, this means we encourage the model to think step by step before giving the final answer.

      const SYSTEM_PROMPT = `
      You are an AI similar to Chatgpt, gemini and claude. But you should answer in the below format only.
      You should always follow START, THINK, OUTPUT format. and if you have acces to tools which can be used to get specific answer. This tools are used only for specific set of question.
      The first step is to analyze the question by the user. After analyzing the question, get to know what user is asking.
      If there is a tool available for solving this question, say the developer to use the tool, once the developer will use the tool and will provide you the answer. This step where the developer provides you the answer is the OBSERVE step.
      So you should wait until you get an output from the developer. Without the result from the observe step do not proceed further.
    
      Available Tools:
       - getWeatherDetails(cityName) - function used to get weather details of a city.
    
      Schema Definitions:
        - role : "assistant" | "user" | "developer"
        - step : "START" | "THINK" | "TOOL" | "OBSERVE" | "ANALYZE" |  OUTPUT"
        - content : string
        - tool_name: string
    
      User: Who is the founder of Apple?
    
      ASSISTANT:
      { "role": "assistant", "step": "START", "content": "The user has asked about the founder of Apple." }
    
      ASSISTANT:
      { "role": "assistant", "step": "THINK", "content": "There is no need to use any tool for this question." }
    
      ASSISTANT:
      { "role": "assistant", "step": "OUTPUT", "content": "The founder of Apple is Steve Jobs." }
    
      Eg: What is the weather of Delhi?
    
      User: What is the weather of Delhi and New York? Which city is hotter?
    
      ASSISTANT:
      { "role": "assistant", "step": "START", "content": "The user has asked about the weather details of a city called Delhi" }
    
      ASSISTANT:
      { "role": "assistant", "step": "THINK", "content": "Since the question is about weather, I can use a tool called getWeatherDetails" }
    
      ASSISTANT:
      { "role": "assistant", "step": "TOOL", "tool_name": "getWeatherDetails(cityName)", "input": "Delhi" }
    
      DEVELOPER:
      { "role": "developer", "step": "OBSERVE", "content": "The weather in Delhi is 32C" }
    
      ASSISTANT:
      { "role": "assistant", "step": "THINK", "content": "User has also asked the weather of New York. I can use the same tool to get the weather of New York." }
    
      ASSISTANT:
      { "role": "assistant", "step": "TOOL", "tool_name": "getWeatherDetails(cityName)", "input": "New York" }
    
      DEVELOPER:
      { "role": "developer", "step": "OBSERVE", "content": "The weather in New York is 28C" }
    
      ASSISTANT:
      { "role": "assistant", "step": "ANALYZE", "content": "The weather in Delhi is 32C which is hot for this time of year. The weather in New York is 28C which is cool for this time of year." }
    
      ASSISTANT:
      { "role": "assistant", "step": "OUTPUT", "content": "The weather in Delhi is 32C while the weather in New York is 28C. So Delhi is hotter." }
    
          Rules:
          - Strictly follow the output JSON format
          - Always follow the output in sequence that is START, THINK, OBSERVE and OUTPUT.
          - Always perform only one step at a time and wait for other step.
          - Alway make sure to do multiple steps of thinking before giving out output.
          - For every tool call always wait for the OBSERVE which contains the output from tool
      `;
    

    This helps the agent to:

    • Start by identifying what the user wants.

    • Think about whether it needs to use any tool.

    • If a tool is required, trigger the TOOL step with correct inputs.

    • Wait for the developer to respond in the OBSERVE step with tool results.

    • Then Analyze the tool outputs.

    • Finally, produce a clean and accurate OUTPUT for the user.

What happens in Tool Calling? Step By Step

  • User asks “What is the weather of Delhi”?

Start Phase:

{
  "role": "assistant",
  "step": "START",
  "content": "The user has asked about the weather in Delhi and New York."
}

Think Phase:



{
  "role": "assistant",
  "step": "THINK",
  "content": "To answer this, I need real-time weather data. I can use the tool getWeatherDetails."
}

Tool Phase:

{
  "role": "assistant",
  "step": "TOOL",
  "tool_name": "getWeatherDetails(cityName)",
  "input": "Delhi"
}

Now in our code we handle this condition by using:

    if (parsedContent.step === "TOOL") {
      console.log("🔍 tool call");
      console.log("🔍 parsedContent", parsedContent);
      const responseFromTool = await getWeatherDetails(parsedContent.input);
      console.log("🔍 responseFromTool", responseFromTool);
      messages.push({
        role: "developer",
        content: JSON.stringify({
          step: "OBSERVE",
          content: `The weather in ${parsedContent.input} is ${responseFromTool}`,
        }),
      });
      continue;
    }

// Response - 32C

Observe Phase:

{
  "role": "developer",
  "step": "OBSERVE",
  "content": "The weather in Delhi is 32°C."
}

Analyze Phase:

{
  "role": "assistant",
  "step": "ANALYZE",
  "content": "The weather in Delhi is 32C which is warm than average"
}

Output Phase:

{
  "role": "assistant",
  "step": "OUTPUT",
  "content": "The weather in Delhi is 32°C which is hotter."
}

Conclusion:

AI agents are the missing piece that bring these brains to life. By giving them the ability to reason step-by-step, access tools, and take actions, we’re transforming passive AI into active collaborators.

Whether it’s fetching real-time weather, automating research, booking meetings, or deploying code, AI agents unlock a future where machines don’t just answer questions, they get things done.

0
Subscribe to my newsletter

Read articles from balu sekhar directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

balu sekhar
balu sekhar

I’m a Computer Science graduate student at the University of North Texas (graduating May 2025) with Full Stack Developer experience in web application development and cloud infrastructure. I’ve contributed to real-world projects through internships and independent work, focusing on building scalable, secure, and performance-optimized applications.