Introduction to Multi-Agents: What are they?
Imagine receiving an email from your manager, and you need to craft the perfect reply. What do you do?
If I had asked you this a few years ago, your answer would likely have been: draft the reply, proofread it, and maybe run it through Grammarly. Now you can simply turn to ChatGPT: copy the email, write a prompt, and let ChatGPT generate a response. You paste it into your email app, hit send, and you’re done.
But what if you’re not satisfied with the initial response?
You might tweak the prompt, ask ChatGPT to refine the content or change the tone from friendly to formal. It could take multiple iterations before you arrive at the perfect reply. This back-and-forth process can be tedious—especially if you're in a time crunch and need to reply quickly.
Now, imagine having an assistant that automatically reads your emails and replies to specific people on your behalf, going through these iterations by itself. All you have to do is sit back and focus on your work.
But what if I told you that you could build an application that goes even further—one that writes code, executes it, detects errors, fixes them, and deploys the project automatically?
At this point, you might think I’m exaggerating. You’ve probably encountered situations where ChatGPT struggled to solve a LeetCode problem or pinpoint the issue with your code. So naturally, the idea that it could build and deploy a functional app with minimal external input might sound too good to be true.
I understand the skepticism—I’ve had those same doubts myself. However, advancements in AI are closing the gap between these possibilities and reality faster than ever before.
So what makes this possible? Multi-agent systems!
So, What Are Multi-Agents? Wait—What Are Agents?
Before diving into multi-agents, let’s first break down what an agent is. In the simplest terms, an agent is a self-contained software entity designed to perform specific tasks autonomously. Think of it like a tiny digital assistant that can act based on the environment, make decisions, and sometimes even learn from past interactions. Agents are guided by rules or goals, but they can also adapt based on the situation.
For example, a chatbot you interact with on a website is an agent—it has a limited purpose, like answering frequently asked questions or helping you reset your password. Another example would be the email filters in your inbox that automatically sort your messages based on rules or patterns.
Put simply, an agent is an LLM that has been given some tools, such as a tool for sending email, a tool for booking flights, or a tool for making reservations. The architecture described below gives you a glimpse into how an agent functions.
User Input – The Spark That Sets Things in Motion
It all begins with you, the user, providing an input, whether that’s a prompt, a command, or a request. This is the ignition point that triggers the agent to start working. You don’t need to overthink; just feed the system a clear task, and it takes care of the rest.
LLM (Router) – The Brain That Figures It All Out
The LLM (Large Language Model) acts as the router, receiving your input and determining what needs to happen next. Its job? Understand the intent behind your query and map out the necessary steps. Think of it as a smart orchestrator that breaks down the task and figures out which tools to call or whether it needs to retrieve information from memory.
Memory – The Context Keeper
The memory module stores relevant details and past interactions. This allows the agent to maintain context across multiple prompts. For instance, if you're iteratively refining an email, the memory ensures the system remembers the previous versions, so it can build on them rather than starting from scratch every time.
Tools – The Specialists for Each Job
These are specialized tools that perform individual tasks. Instead of the LLM trying to do everything itself, it offloads specific parts of the process to these tools. Each tool has its own area of expertise:
Tool 1 might handle data analysis or calculations.
Tool 2 could manage API calls to external services.
Tool 3 might focus on generating code or performing tests.
The beauty here is that the LLM intelligently selects which tool to engage based on what the task demands.
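To make the four parts above concrete, here is a minimal sketch of a single agent in Python. Everything in it is an assumption made for illustration: the `Agent` class, the `calculator` and `echo` tools, and especially the `route` method, which uses a simple keyword rule as a stand-in for a real LLM deciding which tool to call.

```python
from typing import Callable, Dict, List


# Hypothetical tools: plain functions standing in for real integrations.
def calculator(text: str) -> str:
    """Add up every non-negative integer found in the input."""
    numbers = [int(tok) for tok in text.split() if tok.isdigit()]
    return str(sum(numbers))


def echo(text: str) -> str:
    """Fallback tool that simply repeats the input."""
    return f"You said: {text}"


class Agent:
    def __init__(self, tools: Dict[str, Callable[[str], str]]) -> None:
        self.tools = tools            # the specialists for each job
        self.memory: List[str] = []   # the context keeper

    def route(self, user_input: str) -> str:
        # Stand-in for the LLM router: a keyword rule picks the tool.
        # A real agent would ask a language model to make this choice.
        return "calculator" if user_input.lower().startswith("add") else "echo"

    def run(self, user_input: str) -> str:
        tool_name = self.route(user_input)          # decide what to do
        result = self.tools[tool_name](user_input)  # call the chosen tool
        self.memory.append(f"{user_input} -> {tool_name}: {result}")
        return result


agent = Agent({"calculator": calculator, "echo": echo})
print(agent.run("add 2 3 4"))
print(agent.run("hello"))
```

After the two calls, `agent.memory` holds one entry per request, which is the piece a later prompt could build on instead of starting from scratch.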
And What About Multi-Agent Systems?
A multi-agent system (MAS) takes this concept to the next level. It’s a collection of multiple agents working together, either collaboratively or competitively, to achieve a larger goal. Each agent in the system is specialized in performing a particular task, and these agents interact with each other, often coordinating their efforts to solve complex problems that are beyond the capability of any single agent.
Think of a sports team—each player (agent) has a role, like defending or scoring goals, and they coordinate to win the game. Similarly, in multi-agent systems, agents communicate with one another to divide work, share knowledge, and reach decisions collectively.
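As a rough sketch of that division of work, the example below revisits the email reply from the start of this post. The `SpecialistAgent` and `Coordinator` classes and the three skills (`draft`, `formalize`, `proofread`) are hypothetical names invented for this illustration, and each skill is a plain string function where a real system would likely call an LLM.

```python
from typing import Callable, List


class SpecialistAgent:
    """An agent with one narrow skill, like a player with one role."""

    def __init__(self, name: str, skill: Callable[[str], str]) -> None:
        self.name = name
        self.skill = skill

    def handle(self, text: str) -> str:
        return self.skill(text)


# Hypothetical skills; a real agent would delegate these to a model.
def draft(topic: str) -> str:
    return f"Thanks for your note about {topic}."


def formalize(text: str) -> str:
    return text.replace("Thanks", "Thank you")


def proofread(text: str) -> str:
    return text.strip().replace("  ", " ")


class Coordinator:
    """Passes the work from one specialist to the next and logs each step."""

    def __init__(self, agents: List[SpecialistAgent]) -> None:
        self.agents = agents
        self.log: List[str] = []  # shared record every agent's output lands in

    def solve(self, task: str) -> str:
        result = task
        for agent in self.agents:
            result = agent.handle(result)
            self.log.append(f"{agent.name}: {result}")
        return result


team = Coordinator([
    SpecialistAgent("drafter", draft),
    SpecialistAgent("tone", formalize),
    SpecialistAgent("proofreader", proofread),
])
print(team.solve("the  Q3 report"))
```

This is the simplest possible form of coordination, a fixed hand-off from one agent to the next. Real multi-agent systems often let agents decide among themselves who acts next.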
Remember JARVIS from Iron Man? That assistant could do a whole range of tasks, from ordering your coffee to sending emails, writing code, or running tests on the Ultron project. How could it pull off such a wide variety of complex operations while today’s AI assistants (not naming any names) often struggle to crack the tricky bug you’ve been trying to solve for the last 6 hours?
This is where the idea of multi-agent systems comes in. JARVIS isn’t just one giant system handling every task. Instead, we can think of it as a network of specialized agents—each focusing on a particular function, working together seamlessly. Each task—like hacking into a system, running diagnostics, or writing code—is handled by individual agents within the larger JARVIS ecosystem.
In the caption, JARVIS mentions:
- “I’ve hacked into the mainframe and disabled their algorithms.”
Here’s how we can deconstruct this scenario as a multi-agent system:
Hacking the Mainframe – Agent 1: Intrusion Specialist
One specialized agent focuses solely on cyber-infiltration. This agent handles identifying vulnerabilities, bypassing firewalls, and gaining unauthorized access to secure systems.
Disabling Algorithms – Agent 2: Algorithm Manager
A second agent knows how algorithms operate and is skilled in shutting down or disrupting them. This agent might focus on finding the key processes running the enemy’s systems and neutralizing them.
Real-Time Communication – Agent 3: System Communicator
JARVIS ensures that each agent reports back in real time, updating Tony on what’s happening. This agent handles the coordination between Tony's commands and the ongoing processes, ensuring actions are taken in sync without conflict.
Diagnostics and Recovery – Agent 4: System Monitor
Another agent ensures that once the target algorithms are disabled, the system remains stable and no secondary processes kick in. Think of it like keeping an eye on error logs, network activity, and process restarts.
In essence, JARVIS is not just one entity doing everything; it’s a multi-agent system where each agent is responsible for a specific piece of the puzzle. This allows for parallel, optimized operations, which is what makes the system so powerful. Tony can focus on strategy and decisions while the agents handle the tactical stuff—just like in modern AI systems where multiple specialized models or agents work together.
This concept is what makes multi-agent systems so exciting—they can handle diverse and complex problems by dividing and conquering.
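The "parallel" part of that picture can be sketched with Python's standard `concurrent.futures` module. The three agent functions and the `run_in_parallel` helper are hypothetical and only return canned strings; they loosely mirror the monitoring and communication roles above and assume each agent can work independently of the others.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


# Hypothetical agents: each one handles a single piece of the puzzle.
def diagnostics_agent(target: str) -> str:
    return f"diagnostics ok for {target}"


def monitor_agent(target: str) -> str:
    return f"no process restarts on {target}"


def communicator_agent(target: str) -> str:
    return f"status update sent for {target}"


AGENTS: Dict[str, Callable[[str], str]] = {
    "diagnostics": diagnostics_agent,
    "monitor": monitor_agent,
    "communicator": communicator_agent,
}


def run_in_parallel(target: str) -> Dict[str, str]:
    """Start every agent at once and collect each report by agent name."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(agent, target) for name, agent in AGENTS.items()}
        # result() waits for that agent to finish before reading its report.
        return {name: future.result() for name, future in futures.items()}


reports = run_in_parallel("server-1")
for name, report in reports.items():
    print(f"{name}: {report}")
```

The coordinator here only gathers reports. Deciding what to do with them stays with the person in charge, which matches the split between strategy and tactical work described above.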
That’s it on the Introduction to Multi-Agents
In a nutshell, multi-agent systems break complex tasks into smaller parts, with specialized agents working together to get the job done. They’re already making waves across industries—from automating workflows to powering intelligent platforms.
Stay tuned! I’ll be back with another blog diving deeper into multi-agent architectures, their components, and how you can build your own.
Written by
Ayush Gupta
I am a software engineer working in the field of Generative AI and Cloud.