🧠 Building MiniCursor: Your Own AI Coding Agent Using Gemini

Piyush Gaud
4 min read

AI is no longer just about answering questions; it's about building, reasoning, and acting. That's what agents do. In this post, we'll walk through what agents are, how they differ from large language models (LLMs), and how we built our own coding assistant named MiniCursor, which builds HTML/CSS/JS apps using custom tools and Google's Gemini API.

What is an AI Agent?

At its core, an AI agent is like a software intern that knows how to think, plan, and take action. It doesn't just reply to a prompt like a chatbot: it has a purpose, chooses tools, observes outcomes, and adjusts its actions.

Think of it this way:

  • An LLM is reactive. It responds to a single prompt without memory or goals.

  • An agent is proactive. It can break a goal into steps, use functions/tools, and execute code or system commands.

This ability to reason and act is what makes agents so powerful.

Introducing: MiniCursor

MiniCursor is a terminal-based AI coding agent that builds frontend projects. It’s designed to:

  • Take a natural language goal like "Build a calculator"

  • Plan out steps to build it

  • Create folders and write code into files

  • Use custom-defined tools to interact with your system

  • Respond in a structured JSON format

Let’s break down how it works.

Setting up the Environment

First, we configure the Gemini client in Python using the google-generativeai package, with python-dotenv to manage secrets.

from dotenv import load_dotenv
import os
import json  # used later to serialize the observation messages
import google.generativeai as genai

# Read GEMINI_API_KEY from a local .env file
load_dotenv()
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-flash")
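
This assumes a .env file sits next to the script and holds your key, along the lines of:

GEMINI_API_KEY=your-key-here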

Building Custom Tools

MiniCursor supports two tools: one for running terminal commands, and one for writing files.

def run_command(cmd: str):
    """Run a shell command and report whether it succeeded."""
    print(f"📟 Running: {cmd}")
    result = os.system(cmd)  # a non-zero exit status means the command failed
    return f"✅ Command executed: {cmd}" if result == 0 else f"❌ Failed: {cmd}"

def write_file(path: str, content: str):
    """Write content to path, creating or overwriting the file."""
    try:
        with open(path, "w") as file:
            file.write(content)
        return f"✅ File written: {path}"
    except Exception as e:
        return f"❌ Error: {e}"

We map the tool names to their functions:

available_tools = {
    "run_command": run_command,
    "write_file": write_file,
}

Giving It Instructions

We set a system prompt that tells the agent how to behave, what tools it has, and what format to reply in. This is crucial for steering its reasoning.

SYSTEM_PROMPT = """
You are Tecy — an AI assistant that builds basic frontend apps using HTML, CSS, and JS. 
You follow the process: start → plan → action → observe → output.

Rules:
- Respond only in JSON format:
{
  "step": "action",
  "function": "write_file",
  "input": {
    "index.html": "<html>...</html>",
    "styles.css": "body { ... }",
    "script.js": "const x = ...;"
  }
}
- Do not use markdown code blocks inside JSON.
"""

The Main Loop

We then initialize the chat, take user input, and loop through the conversation step-by-step.

messages = [{"role": "user", "parts": [SYSTEM_PROMPT]}]
chat = model.start_chat(history=messages)

while True:
    query = input("\n> ")
    response = chat.send_message(f"{query}\nStart planning.")
    # ...the rest of the loop (parse → act → observe) is walked through below

The agent responds with a plan. If the step is "action", it chooses a tool and sends its input as JSON. For example, to write a file, it might return:

{
  "step": "action",
  "function": "write_file",
  "input": {
    "index.html": "<!DOCTYPE html><html>...</html>"
  }
}

We parse this response and call the right function:

if tool == "write_file":
    for file_name, file_content in tool_input.items():
        result = write_file(file_name, file_content)
        print(f"📁 {result}")

After execution, we send an observation:

response = chat.send_message(json.dumps({
  "step": "observe",
  "output": "Tool executed successfully."
}))

The agent processes the result and either moves to the next step or prints the final output.
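
Putting the fragments together, here's a condensed sketch of how one turn of the loop could look. It reuses the parse_reply helper sketched above; the "output" step with a "content" field, the plain-string input for run_command, and the exact wording of the observe messages are assumptions about the protocol, not code pulled from the repo.

query = input("\n> ")
response = chat.send_message(f"{query}\nStart planning.")

while True:
    parsed = parse_reply(response)  # hypothetical helper from earlier
    step = parsed.get("step")

    if step == "action":
        tool = parsed.get("function")
        tool_input = parsed.get("input", {})
        if tool == "write_file":
            for file_name, file_content in tool_input.items():
                print(f"📁 {write_file(file_name, file_content)}")
        elif tool == "run_command":
            print(run_command(tool_input))  # assumes input is a command string
        # Feed an observation back so the agent can plan its next step
        response = chat.send_message(json.dumps({
            "step": "observe",
            "output": "Tool executed successfully."
        }))
    elif step == "output":
        print(parsed.get("content", ""))  # final answer for this goal
        break
    else:
        # "start" / "plan" steps: acknowledge and let the agent continue
        response = chat.send_message(json.dumps({
            "step": "observe",
            "output": "Continue."
        }))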

Why This Matters

This isn’t just a chatbot writing code snippets. This is an agent that thinks in steps, decides what tool to use, runs shell commands, writes files, and handles structured output.

You can extend MiniCursor with more tools (a sketch of the first one follows the list):

  • read_file to read and summarize code

  • open_browser to test a live server

  • git_commit to save versions
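
As an example of the first item, a read_file tool can mirror write_file almost line for line; what follows is a minimal sketch (the summarizing itself would be left to the model once it sees the file contents):

def read_file(path: str):
    """Return the contents of a file so the agent can reason about it."""
    try:
        with open(path, "r") as file:
            return f"✅ File read: {path}\n{file.read()}"
    except Exception as e:
        return f"❌ Error: {e}"

# Register it alongside the existing tools so the agent can call it by name
available_tools["read_file"] = read_file

You'd also mention the new tool in the system prompt; otherwise the agent has no way of knowing it exists.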

Final Thoughts

MiniCursor is a minimal but powerful example of how agents go beyond LLMs. By teaching them to plan, act, and observe, you get AI that does, not just says.

You give it a goal.
It builds a project.
All in your terminal.

GitHub Repo

MiniCursor on GitHub

Feel free to fork it, try it out, or build your own agent with your favorite tools. This is just the beginning of building tool-using AIs.
