Deep Comparison of MCP, A2A, and ANP Interaction Modes: Differences in Information Organization

This is the third article in the MCP, A2A, and ANP comparison series. The previous two articles:

This article will deeply compare the differences in interaction modes between each protocol, which determine how a tool or agent organizes its information, thus determining the method of information processing and interaction characteristics.

First, the conclusions:

MCP (Remote Call): The server directly exposes all tools and resource lists to the client, which passes this information to the model. The model then calls specific tools or accesses specific resources via RPC as needed.
A2A (Task Outsourcing): Agents showcase an overview of their capabilities, and client agents break down complex tasks into subtasks, distributing them to agents with corresponding expertise for execution, ultimately integrating the results.
ANP (Data Crawling): Agents organize their information and interfaces into a network linked by URLs. Client agents, much like web crawlers, start from the entry point of the description document, acquire information as needed, make decisions locally, and finally execute operations through discovered interfaces.

Core Differences Among the Three:

A2A vs MCP: A2A encapsulates implementation details, with clients only concerned about "what to do" rather than "how to do it," merely requiring task submission and awaiting results.
ANP vs MCP: MCP returns all tools/resources at once, while ANP only provides hierarchical links and descriptions, allowing agents to explore and acquire information as needed.
ANP vs A2A: Both address agent interaction, but A2A emphasizes complex task decomposition and collaboration, while ANP focuses on flexible information acquisition and local decision-making to reduce privacy leak risks.

ANP is built on modern Web technologies, and agents supporting the ANP protocol are typically referred to as WebAgents.

MCP Interaction Mode

MCP's interaction mode adopts the classic client-server (C/S) architecture. During the initial request, capabilities are negotiated, including support for tools, resources, etc.

mcp-init

Servers generally support two core interfaces: tools/list and resources/list, through which clients can obtain information about all tools and resources. This information includes metadata, i.e., descriptions of tools and resources, to inform the model about the capabilities of the tools and how to call them, as well as detailed information about the resources.

The model calls different tools and reads different resources as needed.

For example, with tools, the interaction is as follows:

mcp-tools

In this interaction mode, the Server acts as a passive party, passively providing tools and resources, with decision-making power residing with the client's model.

A2A Interaction Mode

A2A is a flexible P2P interaction mode. In this mode, one agent (the client agent) can directly communicate and collaborate with one or more other agents (remote agents), breaking down a large task into smaller ones and completing them with other agents, similar to multiple professional assistants working together.

The interaction process typically follows these steps:

Discovery and Selection: The client agent first needs to discover and locate remote agents suitable for the current task. It can query multiple agents' "Agent Cards" — which are like skill resumes, listing each agent's capabilities and service methods. The client agent selects one or more most suitable remote agents based on task requirements.
Task Splitting (Optional): For complex tasks, the client agent can split them into multiple subtasks.
Task Assignment and Collaboration: The client agent establishes connections with selected remote agents through the A2A protocol and sends corresponding (sub)task requests to them. For example, sending subtask A to remote agent A, subtask B to remote agent B.
Multi-round Interaction and Result Collection: Each remote agent independently executes the task assigned to it. During execution, they can engage in multi-round interactions with the client agent through messages, providing feedback on progress, asking clarifying questions, or returning intermediate results. The client agent separately receives feedback and results from different remote agents.
Result Aggregation and Completion: After all subtasks are completed, the client agent collects and integrates results from various remote agents, ultimately forming a complete solution for the entire task.

The entire communication process can include task context and status tracking, ensuring that dialogues around specific (sub)tasks proceed in an orderly manner.

mermaid

sequenceDiagram
    participant Client Agent
    participant Remote Agent A
    participant Remote Agent B

    Note over Client Agent: 1. Discovery Phase
    Client Agent->>Remote Agent A: Get Agent Card (/.well-known/agent.json)
    Remote Agent A-->>Client Agent: Return Agent Card (skills, authentication)
    Client Agent->>Remote Agent B: Get Agent Card (/.well-known/agent.json)
    Remote Agent B-->>Client Agent: Return Agent Card (skills, authentication)

    Note over Client Agent: 2. Task Splitting
    Client Agent-->>Client Agent: Split original task into Subtask A & Subtask B

    Note over Client Agent: 3. Subtask Submission to Remote Agent A
    Client Agent->>Remote Agent A: Send Subtask A (tasks/send)

    Note over Client Agent: 4. Subtask Submission to Remote Agent B
    Client Agent->>Remote Agent B: Send Subtask B (tasks/send)

    Note over Client Agent: 5. Collect Result A
    Remote Agent A-->>Client Agent: Return result A

    Note over Client Agent: 6. Collect Result B
    Remote Agent B-->>Client Agent: Return result B

    Note over Client Agent: 7. Task Completion & Aggregation
    Client Agent-->>Client Agent: Aggregate results and deliver final result

In this mode, both the Client Agent and Remote Agent process tasks based on input, and both parties collaborate through clear task descriptions and result interactions.

ANP Interaction Mode

ANP (Agent Network Protocol) is also a P2P interaction mode, similarly aimed at solving communication problems between agents. It has a fundamental difference from A2A:

A2A adopts a "task outsourcing" model, where agents split tasks and assign them to other agents for execution
ANP adopts a "data crawling" model, where agents only acquire information from other agents, make decisions locally, and then execute operations through API calls

ANP's core feature is the use of "Linked Data" technology to organize all of an agent's capabilities, interfaces, and information into a navigable data network. This differs from MCP's approach of directly returning all tools and resource lists.

ANP Interaction Process

anp-interaction-flow

The ANP interaction process is like a web crawler browsing web pages:

Entry Discovery: The agent first obtains the URL of another agent's "agent description document" (similar to a website's homepage)
Capability Discovery: From the description document, the agent learns about the counterpart's skills, interfaces, and available resources, as well as URLs linking to more detailed documents
Selective Acquisition: The agent only accesses those URLs relevant to the current task's needs, acquiring necessary information (rather than all information)
Recursive Navigation: Each document may contain new URL links, allowing the agent to continue navigating and pulling as needed
Local Decision Making: After collecting sufficient information, the agent makes logical judgments and decisions within its own model
Interface Calling: Finally, it executes specific operations through discovered API interfaces

The advantage of this approach is that the agent only acquires the information it truly needs, saving data transmission and maintaining decision-making control in its own hands.

Differences in Information Organization Methods

The differences in interaction modes described above are essentially differences in how each protocol organizes the information that agents expose externally:

MCP (Full Disclosure): Through RPC calls, all tool and resource lists are disclosed at once, allowing client agents to directly select and call the tools they need
A2A (Capability Cards): Through Agent Cards, agents showcase their skills and capability overviews without revealing specific implementation details, clients only need to submit tasks and wait for results
ANP (Linked Network): Using Linked-Data technology to organize agent information into a network structure, client agents can selectively acquire specific information as needed, much like web crawlers

Deep Comparison of MCP, A2A, and ANP Interaction Modes: Differences in Information Organization ​

MCP Interaction Mode ​

A2A Interaction Mode ​

ANP Interaction Mode ​