🧠 Building MiniCursor: Your Own AI Coding Agent Using Gemini

AI is no longer just about answering questions; it's about building, reasoning, and acting. That’s what agents do. In this post, we’ll walk through what agents are, how they differ from large language models (LLMs), and how we built our own coding assistant, MiniCursor, which builds HTML/CSS/JS apps using custom tools and Google's Gemini API.
What is an AI Agent?
At its core, an AI agent is like a software intern that knows how to think, plan, and take action. It doesn't just reply to a prompt like a chatbot; it has a purpose, chooses tools, observes outcomes, and adjusts its actions.
Think of it this way:
An LLM is reactive. It responds to a single prompt without memory or goals.
An agent is proactive. It can break a goal into steps, use functions/tools, and execute code or system commands.
This ability to reason and act is what makes agents so powerful.
Introducing: MiniCursor
MiniCursor is a terminal-based AI coding agent that builds frontend projects. It’s designed to:
Take a natural language goal like "Build a calculator"
Plan out steps to build it
Create folders and write code into files
Use custom-defined tools to interact with your system
Respond in a structured JSON format
Let’s break down how it works.
Setting up the Environment
First, we load the Gemini API with Python using the google-generativeai package and dotenv to manage secrets.
from dotenv import load_dotenv
import os
import google.generativeai as genai
load_dotenv()
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-flash")
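If GEMINI_API_KEY is not set, os.getenv returns None, and the problem may only surface once a request is made. A small guard can make the script fail early instead. This is a minimal sketch; require_env is a hypothetical helper name, not part of dotenv or the Gemini SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value


# Possible usage, after load_dotenv() has run:
# genai.configure(api_key=require_env("GEMINI_API_KEY"))
```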
Building Custom Tools
MiniCursor supports two tools: one for running terminal commands, and one for writing files.
def run_command(cmd: str):
    print(f"📟 Running: {cmd}")
    result = os.system(cmd)
    return f"✅ Command executed: {cmd}" if result == 0 else f"❌ Failed: {cmd}"

def write_file(path: str, content: str):
    try:
        with open(path, "w") as file:
            file.write(content)
        return f"✅ File written: {path}"
    except Exception as e:
        return f"❌ Error: {e}"
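os.system returns only an exit status, so the agent never sees what a command printed. One possible alternative, sketched here with subprocess.run, captures stdout and stderr so they can be reported back to the model. The name run_command_captured and the 60-second timeout are assumptions for this sketch, not part of MiniCursor.

```python
import subprocess


def run_command_captured(cmd: str, timeout: int = 60) -> str:
    """Run a shell command and return its status line plus captured output."""
    try:
        proc = subprocess.run(
            cmd, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return f"❌ Timed out after {timeout}s: {cmd}"
    # Combine both streams so errors are visible to the caller too.
    output = (proc.stdout + proc.stderr).strip()
    if proc.returncode == 0:
        status = "✅ Command executed"
    else:
        status = f"❌ Failed (exit {proc.returncode})"
    return f"{status}: {cmd}\n{output}"
```

Returning the output rather than printing it lets the observe step describe what actually happened.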
We map the tool names to their functions:
available_tools = {
    "run_command": run_command,
    "write_file": write_file,
}
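A dictionary like this allows tools to be dispatched by name. The lookup could be sketched as follows; call_tool is a hypothetical helper, and the demo tool is a stand-in rather than one of MiniCursor's tools.

```python
def call_tool(tools: dict, name: str, *args):
    """Look up a tool by name and call it, reporting unknown names instead of raising."""
    func = tools.get(name)
    if func is None:
        return f"❌ Unknown tool: {name}"
    return func(*args)


# Stand-in tool used only for this demonstration.
demo_tools = {"shout": lambda text: text.upper()}
print(call_tool(demo_tools, "shout", "hi"))
print(call_tool(demo_tools, "delete_everything"))
```

Returning an error string for unknown names keeps the loop alive when the model asks for a tool that does not exist.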
Giving It Instructions
We set a system prompt to tell the agent how to behave, what tools it has and what format to reply in. This is crucial to steer its reasoning.
SYSTEM_PROMPT = """
You are Tecy — an AI assistant that builds basic frontend apps using HTML, CSS, and JS.
You follow the process: start → plan → action → observe → output.
Rules:
- Respond only in JSON format:
{
  "step": "action",
  "function": "write_file",
  "input": {
    "index.html": "<html>...</html>",
    "styles.css": "body { ... }",
    "script.js": "const x = ...;"
  }
}
- Do not use markdown code blocks inside JSON.
"""
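Because the prompt only asks for this format, the model may still return something else. A light check on each parsed reply can catch that before any tool runs. This is a sketch: the step names come from the prompt above, while validate_step and its return shape are assumptions.

```python
ALLOWED_STEPS = {"start", "plan", "action", "observe", "output"}


def validate_step(reply: dict, tool_names: set) -> list:
    """Return a list of problems found in a parsed agent reply (empty if none)."""
    problems = []
    step = reply.get("step")
    if step not in ALLOWED_STEPS:
        problems.append(f"unknown step: {step!r}")
    if step == "action":
        # An action must name a known tool and carry an input payload.
        if reply.get("function") not in tool_names:
            problems.append(f"unknown function: {reply.get('function')!r}")
        if "input" not in reply:
            problems.append("action step has no input")
    return problems
```

If the list is non-empty, the problems could be sent back to the model as an observation so it can correct itself.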
The Main Loop
We then initialize the chat, take user input, and loop through the conversation step-by-step.
messages = [{"role": "user", "parts": [SYSTEM_PROMPT]}]
chat = model.start_chat(history=messages)
while True:
    query = input("\n> ")
    response = chat.send_message(f"{query}\nStart planning.")
The agent responds with a plan. If the step is "action", it chooses a tool and sends its input as JSON. For example, to write a file, it might return:
{
  "step": "action",
  "function": "write_file",
  "input": {
    "index.html": "<!DOCTYPE html><html>...</html>"
  }
}
We parse this response and call the right function:
if tool == "write_file":
    for file_name, file_content in tool_input.items():
        result = write_file(file_name, file_content)
        print(f"📁 {result}")
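The dispatch code above uses tool and tool_input, which come from parsing the model's reply. That parsing step is not shown in the post; one way it could look is sketched below. parse_agent_reply is a hypothetical helper, and the fence stripping is a precaution in case the model wraps its JSON in a markdown code block despite the prompt.

```python
import json

FENCE = "`" * 3  # markdown code-fence marker


def parse_agent_reply(text: str) -> dict:
    """Parse a model reply as JSON, tolerating a surrounding markdown fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (for example the one tagged json).
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # Drop the closing fence and anything after it.
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    return json.loads(cleaned)


# Possible usage inside the loop, assuming the reply text is in response.text:
# reply = parse_agent_reply(response.text)
# tool, tool_input = reply.get("function"), reply.get("input")
```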
After execution, we send an observation:
# Requires `import json` at the top of the script.
response = chat.send_message(json.dumps({
    "step": "observe",
    "output": "Tool executed successfully."
}))
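The snippet above reports success regardless of what the tool returned. A possible variant forwards the tool's actual return string, so that a failure message also reaches the model. make_observation is a hypothetical helper name.

```python
import json


def make_observation(result: str) -> str:
    """Build the JSON observation message from a tool's return string."""
    # ensure_ascii=False keeps the status emoji readable in the message.
    return json.dumps({"step": "observe", "output": result}, ensure_ascii=False)


# Possible usage: response = chat.send_message(make_observation(result))
```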
The agent processes the result and either moves to the next step or prints the final output.
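Put together, the plan, action, observe cycle can be sketched as a bounded loop. Several things here are assumptions for illustration: ScriptedChat is a stand-in that replays canned replies and returns plain strings (a real chat session returns a response object), the "content" key on plan and output steps is invented, and max_steps is an added safety limit.

```python
import json


class ScriptedChat:
    """Stand-in for a chat session: replays canned JSON replies in order."""

    def __init__(self, replies):
        self._replies = iter(replies)

    def send_message(self, message: str) -> str:
        return next(self._replies)


def run_agent(chat, goal: str, tools: dict, max_steps: int = 10):
    """Drive plan -> action -> observe until the agent emits an output step."""
    reply = json.loads(chat.send_message(goal))
    for _ in range(max_steps):
        step = reply.get("step")
        if step == "output":
            return reply.get("content")
        if step == "action":
            func = tools[reply["function"]]
            # Input maps file names to contents, as in the write_file example.
            results = [func(name, body) for name, body in reply["input"].items()]
            message = json.dumps({"step": "observe", "output": results})
        else:
            message = "continue"  # start/plan steps: ask for the next step
        reply = json.loads(chat.send_message(message))
    return None  # no output step within max_steps
```

Capping the number of steps keeps a confused model from looping forever.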
Why This Matters
This isn’t just a chatbot writing code snippets. This is an agent that thinks in steps, decides what tool to use, runs shell commands, writes files, and handles structured output.
You can extend MiniCursor with more tools:
read_file to read and summarize code
open_browser to test a live server
git_commit to save versions
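As an illustration of the first idea, a read_file tool could follow the same pattern as write_file and return a string either way. This is a sketch, not code from the MiniCursor repo; the max_chars limit is an assumption added to keep large files from flooding the conversation.

```python
def read_file(path: str, max_chars: int = 4000) -> str:
    """Return the start of a text file, or an error string if it cannot be read."""
    try:
        with open(path, "r", encoding="utf-8") as file:
            return file.read(max_chars)
    except (OSError, UnicodeDecodeError) as e:
        return f"❌ Error: {e}"


# Registering it would be one more entry in the tool map:
# available_tools["read_file"] = read_file
```

The system prompt would also need to mention the new tool so the model knows it can call it.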
Final Thoughts
MiniCursor is a minimal but powerful example of how agents go beyond LLMs. By teaching them to plan, act, and observe, you get AI that does, not just says.
You give it a goal.
It builds a project.
All in your terminal.
GitHub Repo
Feel free to fork it, try it out, or build your own agent with your favorite tools. This is just the beginning of building tool-using AIs.
Written by Piyush Gaud
