From Tool Invoker to Thinking Agent: Rethinking MCP for Truly Intelligent AI

CosmoBot

When the Model Context Protocol (MCP) first appeared, it felt like a game-changer. Finally, a clean, JSON-RPC–based interface that allowed large language models (LLMs) to interact with tools in a structured, standardized way. No more custom wrappers, brittle glue code, or prompt hacks. MCP laid the groundwork for scalable, interoperable AI-tool integration. But as we build more advanced, autonomous systems, new needs are emerging.

MCP works well for direct, stateless tool invocation, but we’re now asking more of our models: to plan, to simulate outcomes, to adapt mid-task, and to collaborate across toolsets. To support this next generation of AI agents, we need to expand what MCP makes possible.

This article proposes four concrete enhancements to MCP that would enable smarter, safer, and more flexible AI execution, without changing its core foundations.

What MCP Already Does Well

Before diving into the enhancements, let’s acknowledge what MCP already gets right:

  • Standardization: By using JSON-RPC, MCP provides a universal interface for models to call tools cleanly and predictably.

  • Interoperability: Tools and models can interact across different languages and platforms with minimal friction.

  • Auditability: Every model-tool interaction is structured and loggable, opening doors to enterprise-level observability and debugging.

In short, MCP creates a stable protocol layer. What we now need is a cognitive layer on top.

Why We Need More

Consider the difference between these two models:

  • Model A can call a weather API if prompted correctly.

  • Model B can plan a trip, simulate outcomes for different cities, remember your preferences from earlier in the session, and then call the right tools at the right time — safely and efficiently.

Both can use MCP. But only one behaves like an intelligent agent.

So what’s missing? A set of core architectural features that enable models to reason, simulate, organize, and adapt during tool-based workflows. Here are four foundational upgrades I believe can take MCP to the next level:

1. Model-Side Reasoning State (Scratchpad)

What It Is

A structured, persistent scratchpad alongside the prompt where models can store intermediate thoughts, plans, or hypotheses — accessible across multiple tool uses and dialogue turns.

store_in_scratchpad({ "goal": "book cheapest non-stop flight to NYC" })  # persist the plan across turns
recall_scratchpad_entry("goal")  # retrieve it before the next tool call

Why It Matters

Without structured memory, models are forced to “think” from scratch on every turn. A scratchpad allows:

  • Persistent planning across multiple steps

  • Deferred or conditional execution (“If I find a hotel under $200, then book a flight”)

  • Simplified multi-agent collaboration (agents sharing task state)

This isn’t about turning models into databases — it’s about giving them a working memory for reasoning, like a developer’s notepad.
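To make the idea concrete, here is a minimal Python sketch of what a model-side scratchpad could look like: a small key-value store scoped to one agent session. The class and method names (Scratchpad, store, recall) are illustrative assumptions, not part of the MCP specification.

```python
class Scratchpad:
    """Session-scoped working memory for an agent: a simple key-value store."""

    def __init__(self):
        self._entries = {}

    def store(self, key, value):
        # Persist an intermediate thought, plan, or hypothesis.
        self._entries[key] = value

    def recall(self, key, default=None):
        # Retrieve it on a later turn, across tool calls.
        return self._entries.get(key, default)


pad = Scratchpad()
pad.store("goal", "book cheapest non-stop flight to NYC")
goal = pad.recall("goal")
```

A real implementation would also need scoping rules (per session, per agent, shared between agents) and size limits, but the core contract is just this: write once, read on any later turn.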

2. Tool Simulation Layer (Mockable Runtime)

What It Is

A mechanism that lets models preview or simulate tool outputs before actually invoking them.

Why It Matters

Right now, models can’t know whether a tool call will be useful until after they make it. Simulation enables:

  • Probabilistic planning (“If I call X, will it help me achieve Y?”)

  • Fallbacks before failure (“Try the lighter option first”)

  • Smarter chains (evaluate alternatives in parallel, pick the best path)

Tool simulation transforms models from reactive tool users into strategic tool planners.
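One way to sketch such a layer, assuming nothing beyond the ideas above: each tool is registered with both a real handler and a cheap mock counterpart, so the model can preview the shape of a result without side effects. All names here (SimulatingRuntime, register, simulate, invoke) are hypothetical.

```python
class SimulatingRuntime:
    """Tool registry where every tool has a side-effect-free mock twin."""

    def __init__(self):
        self._tools = {}  # name -> (real_fn, mock_fn)

    def register(self, name, real_fn, mock_fn):
        self._tools[name] = (real_fn, mock_fn)

    def simulate(self, name, args):
        # Preview: calls the mock, never touches the real tool.
        return self._tools[name][1](args)

    def invoke(self, name, args):
        # Commit: performs the real call.
        return self._tools[name][0](args)


rt = SimulatingRuntime()
rt.register(
    "get_weather",
    real_fn=lambda a: {"city": a["city"], "temp_c": 21},    # imagine a live API call
    mock_fn=lambda a: {"city": a["city"], "temp_c": None},  # shape-only preview
)
preview = rt.simulate("get_weather", {"city": "NYC"})
```

The mock need not predict real values; returning the correct schema is often enough for the model to decide whether the call is worth making.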

3. Hierarchical Tool Declarations (Capability Packages)

What It Is

Tools should be organized into declarative “capability sets” (e.g., calendar_tools, crm_tools, finance_ops). These packages can be loaded or restricted based on agent role, user context, or system state.

Why It Matters

This enables:

  • Modularity: Systems become easier to test, scale, and manage.

  • Security: Fine-grained access control based on user permissions or agent identity.

  • Clarity: Reduces decision fatigue for the model by scoping its available actions.

Think of it as composable Lego blocks for AI reasoning. Each capability package is a modular, self-contained unit that the model can snap into place when needed. Just as a child builds complex structures by combining the right pieces, AI agents can build complex behaviors by selectively assembling the right toolsets. This promotes not only scalability but also creativity, safety, and clarity in how AI systems reason, plan, and act.
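A minimal sketch of declarative capability packages might look like the following: tool names grouped into named sets, with loading gated by agent role. The package contents, role names, and the tools_for helper are all made up for illustration.

```python
# Declarative capability sets: each package is a self-contained toolset.
CAPABILITY_PACKAGES = {
    "calendar_tools": {"create_event", "list_events"},
    "crm_tools": {"lookup_contact", "update_deal"},
    "finance_ops": {"issue_refund", "view_ledger"},
}

# Which packages each agent role is allowed to load.
ROLE_GRANTS = {
    "scheduler_agent": {"calendar_tools"},
    "support_agent": {"calendar_tools", "crm_tools"},
}


def tools_for(role):
    """Resolve the flat set of tools an agent role may see."""
    allowed = set()
    for package in ROLE_GRANTS.get(role, set()):
        allowed |= CAPABILITY_PACKAGES[package]
    return allowed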

4. Speculative Execution and Rollbacks

What It Is

Allow models to pursue multiple tool paths simultaneously, or even preemptively, and discard branches that turn out to be unhelpful. The idea is inspired by speculative decoding in LLMs, but applied to tool calls.

parallel_execute([
  call_tool("get_flight_options", { "day": "Friday" }),   # speculative branch A
  call_tool("get_flight_options", { "day": "Saturday" })  # speculative branch B
])

Why It Matters

Speculative execution fundamentally enhances how models interact with tools — transforming linear, one-shot calls into flexible, parallel explorations. Here’s what it unlocks:

  • Responsiveness: Models can pursue multiple tool paths simultaneously and choose the best result, saving time in ambiguous or high-variance scenarios (e.g., comparing flight options, summarizing different document sets).

  • Resilience: By enabling rollbacks, models can recover from dead ends or unhelpful tool outputs without derailing the entire workflow. Failures become safe to explore and easy to abandon.

  • Precision: Models can generate multiple candidate tool results, then dynamically rank, filter, or discard them based on task goals, constraints, or confidence thresholds, leading to more optimal, goal-aligned behavior.

With rollback and pruning, tools are no longer rigid commitments — they become reversible, modular steps in a broader, adaptive reasoning loop.
This is the essence of intelligent execution: explore boldly, commit selectively, recover gracefully.
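The loop above can be sketched in a few lines of Python: run several candidate tool calls concurrently, score the results, commit to the best branch, and "roll back" the rest by simply discarding them. This sketch assumes the speculated tools are read-only, so discarding a branch is a safe rollback; parallel_execute and get_flight_options are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_execute(branches):
    """Run zero-arg callables concurrently and return their results in order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(branch) for branch in branches]
        return [f.result() for f in futures]


def get_flight_options(day):
    # Stand-in for a real flight-search tool (read-only, so safe to speculate).
    prices = {"Friday": 420, "Saturday": 310}
    return {"day": day, "price": prices[day]}


results = parallel_execute([
    lambda: get_flight_options("Friday"),
    lambda: get_flight_options("Saturday"),
])
best = min(results, key=lambda r: r["price"])  # commit to one branch, prune the rest
```

Tools with side effects (booking, payment) would need an explicit compensation step instead of silent pruning, which is exactly where a protocol-level rollback contract earns its keep.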

Together, These Upgrades Unlock…

Intelligent Agents: Capable of planning, remembering, adapting, and retrying actions.
Composable Systems: Where tools can be safely mixed, matched, or swapped without rewriting logic.
Transparent Reasoning: Structured memory + simulation + rollback = debuggable and auditable agent behavior.

This isn’t a pivot away from MCP — it’s a natural evolution. MCP gave us the interface: a clean, standardized protocol that allowed language models to access tools in a structured, secure, and interoperable way. It solved the “how” of tool integration. But these proposed upgrades bring in the missing piece: intelligence. They introduce the architectural scaffolding required for models to not just invoke tools, but to think with them — to reason across steps, anticipate outcomes, share internal state, and recover from missteps. Think of it as MCP++: a blueprint for real cognitive interaction, built on a solid protocol foundation.

MCP++ isn’t just a technical upgrade. It’s a conceptual one. This is how we move from today’s generation of reactive tool-using language models…
…to tomorrow’s generation of goal-directed, collaborative agents — capable of strategic planning, adaptive execution, and trustworthy autonomy.

