A2A Protocol Simplified: Quick Guide

Definition: A2A stands for agent-to-agent. The protocol focuses on how agents communicate and interact with one another.

Benefits of Agent-to-Agent Communication

Benefit	Description
Interoperability (A standard)	Enables agents from different systems to communicate and collaborate
Performance	Allows division of labor and specialization for faster problem-solving; specialized agents bring domain knowledge, reduce prompt cost, and improve accuracy.
Scalability for Your System	Supports easy expansion and integration of new agents to your system
Streamlined Maintenance	Simplifies integration, updates, and troubleshooting
A way for you to share your model	Build a model and let others connect to it with this protocol

Real-World Example

Manus demonstrates a successful implementation by providing a unified platform with multiple specialized agents. These agents handle different tasks and likely communicate using an agent-to-agent protocol. This architecture enables Manus to achieve improved performance through specialization.

Y Combinator video on Manus: The Next Breakthrough In AI Agents Is Here

The picture above provides a simplified overview of how Manus operates, and the communication appears similar to what the A2A protocol aims to achieve.

Conceptual Overview

The Agent2Agent (A2A) protocol facilitates communication between independent AI agents. Here are the core concepts:

Agent Card: A public metadata file (usually at /.well-known/agent.json) describing an agent's capabilities, skills, endpoint URL, and authentication requirements. Clients use this for discovery.
A2A Server: An agent exposing an HTTP endpoint that implements the A2A protocol methods (defined in the JSON specification). It receives requests and manages task execution.
A2A Client: An application or another agent that consumes A2A services. It sends requests (like tasks/send) to an A2A Server's URL.
Task: The central unit of work. A client initiates a task by sending a message (tasks/send or tasks/sendSubscribe). Tasks have unique IDs and progress through states (submitted, working, input-required, completed, failed, canceled).
Message: Represents communication turns between the client (role: "user") and the agent (role: "agent"). Messages contain Parts.
Part: The fundamental content unit within a Message or Artifact. Can be TextPart, FilePart (with inline bytes or a URI), or DataPart (for structured JSON, e.g., forms).
Artifact: Represents outputs generated by the agent during a task (e.g., generated files, final structured data). Artifacts also contain Parts.
Streaming: For long-running tasks, servers supporting the streaming capability can use tasks/sendSubscribe. The client receives Server-Sent Events (SSE) containing TaskStatusUpdateEvent or TaskArtifactUpdateEvent messages, providing real-time progress.
Push Notifications: Servers supporting pushNotifications can proactively send task updates to a client-provided webhook URL, configured via tasks/pushNotification/set.
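
To make the containment between these concepts concrete (a Task holds Messages and Artifacts, and both hold Parts), here is a minimal sketch using Python dataclasses. The class and field names are illustrative assumptions chosen to mirror the terms above; they are not taken from the official schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional, Union


@dataclass
class TextPart:
    text: str
    type: str = "text"


@dataclass
class FilePart:
    # Either a URI or inline (base64) bytes; both field names are assumptions.
    uri: Optional[str] = None
    bytes_b64: Optional[str] = None
    type: str = "file"


@dataclass
class DataPart:
    data: dict  # structured JSON, e.g. a form
    type: str = "data"


Part = Union[TextPart, FilePart, DataPart]


@dataclass
class Message:
    role: str  # "user" (client) or "agent"
    parts: List[Part]


@dataclass
class Artifact:
    parts: List[Part]  # outputs produced by the agent


@dataclass
class Task:
    id: str
    status: str = "submitted"
    messages: List[Message] = field(default_factory=list)
    artifacts: List[Artifact] = field(default_factory=list)


# A task with one user turn, converted to a plain dict for JSON encoding.
task = Task(id="task-12345")
task.messages.append(Message(role="user", parts=[TextPart("Hello")]))
task_dict = asdict(task)
```

Keeping Parts as the shared building block means the same serialization code can handle both what the client sends and what the agent returns.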

Typical Flow:

Discovery: Client fetches the Agent Card from the server's well-known URL.
Initiation: Client sends a tasks/send or tasks/sendSubscribe request containing the initial user message and a unique Task ID.
Processing:
- (Streaming): Server sends SSE events (status updates, artifacts) as the task progresses.
- (Non-Streaming): Server processes the task synchronously and returns the final Task object in the response.
Interaction (Optional): If the task enters input-required, the client sends subsequent messages using the same Task ID via tasks/send or tasks/sendSubscribe.
Completion: The task eventually reaches a terminal state (completed, failed, canceled).
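
The flow above boils down to a small state machine on the client side. The sketch below encodes the task states listed earlier and shows one possible way a client could decide what to do next; the function name and the returned action strings are hypothetical.

```python
from enum import Enum


class TaskState(str, Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# States after which the task will not change again.
TERMINAL_STATES = frozenset(
    {TaskState.COMPLETED, TaskState.FAILED, TaskState.CANCELED}
)


def next_client_action(state: TaskState) -> str:
    """Return a label for what the client should do in this state (illustrative)."""
    if state in TERMINAL_STATES:
        return "stop"
    if state is TaskState.INPUT_REQUIRED:
        return "send follow-up message"
    return "wait for updates"


# States arrive as strings on the wire, so parse them into the enum first.
action = next_client_action(TaskState("input-required"))
```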

Examples (technical details)

1. Discovery

Fetch the agent's metadata (AgentCard):

HTTP GET Request

GET https://agent-domain.com/.well-known/agent.json
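
As a rough sketch of the discovery step in Python (standard library only): the helper names are hypothetical, and the keys read from the card ("capabilities", "streaming") are assumptions based on the description above rather than a checked schema.

```python
import json
import urllib.request
from urllib.parse import urljoin

WELL_KNOWN_PATH = "/.well-known/agent.json"


def agent_card_url(base_url: str) -> str:
    """Build the well-known Agent Card URL for the agent's host."""
    return urljoin(base_url, WELL_KNOWN_PATH)


def fetch_agent_card(base_url: str, timeout: float = 10.0) -> dict:
    """Download and decode the Agent Card (performs a real HTTP GET)."""
    with urllib.request.urlopen(agent_card_url(base_url), timeout=timeout) as resp:
        return json.load(resp)


def supports_streaming(card: dict) -> bool:
    """Check the (assumed) capabilities.streaming flag on a card."""
    return bool(card.get("capabilities", {}).get("streaming", False))


# A hand-written card used for illustration only; not a real agent.
sample_card = {
    "name": "Example Agent",
    "url": "https://agent-domain.com/a2a",
    "capabilities": {"streaming": True, "pushNotifications": False},
}
card_url = agent_card_url("https://agent-domain.com")
```

A client would typically call fetch_agent_card once, then use supports_streaming to choose between tasks/send and tasks/sendSubscribe.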

2. Initiation

Send a task to the agent:

Example: tasks/send or tasks/sendSubscribe (for streaming)

POST https://agent-domain.com/a2a/tasks/send
Content-Type: application/json

{
  "taskId": "task-12345",
  "message": {
    "role": "user",
    "parts": [
      {
        "type": "text",
        "text": "Summarize this article: https://example.com/article"
      }
    ]
  }
}
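
The same request can be assembled in Python. This sketch follows the simplified request shape used in this guide; the full specification may wrap the call in a different envelope, so treat the field names and URL layout as illustrative. The helper name build_send_request and the tasks/sendSubscribe URL path are assumptions.

```python
import json
import urllib.request


def build_send_request(
    base_url: str, task_id: str, text: str, streaming: bool = False
) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying the first user message."""
    method = "tasks/sendSubscribe" if streaming else "tasks/send"
    body = {
        "taskId": task_id,
        "message": {
            "role": "user",
            "parts": [{"type": "text", "text": text}],
        },
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{method}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_send_request(
    "https://agent-domain.com/a2a",
    "task-12345",
    "Summarize this article: https://example.com/article",
)
# Passing `request` to urllib.request.urlopen would actually send it.
```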

3. Processing

If streaming, the client receives Server-Sent Events (SSE) from the server:

Example SSE:

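event: TaskStatusUpdateEvent
data: {"taskId": "task-12345", "status": "working"}

event: TaskArtifactUpdateEvent
data: {"taskId": "task-12345", "artifacts": []}

(Every SSE data line must start with data:, so each JSON payload is shown on one line. In a real stream, the artifacts array carries the artifact data.)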
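
On the client side, the stream has to be split back into events. Below is a minimal sketch of an SSE parser in Python, assuming every data payload is JSON; parse_sse is a hypothetical helper, and a production client would more likely rely on an existing SSE library.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, decoded_json) pairs from raw SSE lines."""
    event_name = "message"  # SSE default when no "event:" field is present
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field_name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field_name == "event":
            event_name = value
        elif field_name == "data":
            data_lines.append(value)


# Sample stream written for this sketch; the second event spans two data lines.
SAMPLE_STREAM = (
    "event: TaskStatusUpdateEvent\n"
    'data: {"taskId": "task-12345", "status": "working"}\n'
    "\n"
    "event: TaskArtifactUpdateEvent\n"
    'data: {"taskId": "task-12345",\n'
    'data:  "artifacts": []}\n'
    "\n"
)
events = list(parse_sse(SAMPLE_STREAM.splitlines()))
```

Multiple data lines belonging to one event are joined with newlines before decoding, which is why a JSON payload may be spread over several data lines.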


4. Interaction (Optional)

If the agent requires more input, the client responds:

POST https://agent-domain.com/a2a/tasks/send
Content-Type: application/json

{
  "taskId": "task-12345",
  "message": {
    "role": "user",
    "parts": [
      {
        "type": "text",
        "text": "No, use informal Spanish instead."
      }
    ]
  }
}
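
The key point of this step is that every follow-up reuses the same Task ID. The sketch below shows one way a client loop might handle input-required. The transport is passed in as a plain callable, and FakeAgent is an in-memory stand-in invented for the demo, so no network is involved; all names and the first message are hypothetical.

```python
from typing import Callable, List


def user_message(task_id: str, text: str) -> dict:
    """Build a request body in this guide's simplified shape."""
    return {
        "taskId": task_id,
        "message": {"role": "user", "parts": [{"type": "text", "text": text}]},
    }


def run_until_done(
    send: Callable[[dict], dict],
    task_id: str,
    first_text: str,
    answer: Callable[[dict], str],
    max_turns: int = 5,
) -> dict:
    """Send the first message, then answer input-required prompts."""
    task = send(user_message(task_id, first_text))
    turns = 0
    while task["status"] == "input-required" and turns < max_turns:
        # Same task_id on every turn, so the server continues the same task.
        task = send(user_message(task_id, answer(task)))
        turns += 1
    return task


class FakeAgent:
    """Asks for more input once, then completes (demo only)."""

    def __init__(self) -> None:
        self.seen_task_ids: List[str] = []

    def __call__(self, request: dict) -> dict:
        self.seen_task_ids.append(request["taskId"])
        status = "input-required" if len(self.seen_task_ids) == 1 else "completed"
        return {"taskId": request["taskId"], "status": status}


agent = FakeAgent()
final_task = run_until_done(
    agent,
    "task-12345",
    "Translate a greeting into Spanish.",  # hypothetical first message
    lambda task: "No, use informal Spanish instead.",
)
```

The max_turns guard is a design choice for the sketch: it keeps a misbehaving agent from holding the client in an endless clarification loop.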

5. Completion

When the task is done, the client receives the final result.

Non-streaming: the response contains the full task:

{
  "taskId": "task-12345",
  "status": "completed",
  "artifacts": [
    {
      "parts": [
        {
          "type": "text",
          "text": "Qué tal, ¿cómo estás?"
        }
      ]
    }
  ]
}
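
Reading the result out of such a response is a short walk over artifacts and their parts. A minimal sketch, assuming the simplified response shape shown above; artifact_texts is a hypothetical helper name.

```python
from typing import List


def artifact_texts(task: dict) -> List[str]:
    """Collect the text of every text part in a completed task's artifacts."""
    if task.get("status") != "completed":
        return []  # failed, canceled, or still-running tasks yield nothing here
    texts: List[str] = []
    for artifact in task.get("artifacts", []):
        for part in artifact.get("parts", []):
            if part.get("type") == "text":
                texts.append(part["text"])
    return texts


# The response from the example above, as a Python dict.
completed_task = {
    "taskId": "task-12345",
    "status": "completed",
    "artifacts": [
        {"parts": [{"type": "text", "text": "Qué tal, ¿cómo estás?"}]}
    ],
}
result_texts = artifact_texts(completed_task)
```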

Possible outcomes at completion: completed, failed, or canceled.

Code examples here: https://github.com/google/A2A/tree/main/samples
