Minimalist Implementation of an AI Agent with Docker Model Runner and Docker MCP Toolkit


Disclaimer: this is an experiment and I'm not perfect, so the code examples may be improved in the future.
If you go looking for the definition of an AI Agent, you'll find plenty of them, each granting the agent more or fewer "powers". I really like Anthropic's take on the subject in the article Building effective agents. Here's my attempt at a summary:
The fundamental building block of an agentic system is an LLM augmented with capabilities like information retrieval, tool use, and memory. Anthropic clearly distinguishes "workflows" (systems where LLMs follow predefined paths) from true "agents" that "dynamically" direct their own processes. The most effective agents are generally not built on complex frameworks but on simple, composable patterns: essentially an LLM using tools, guided by feedback from its environment, in an interaction loop.
For this two-part article, I'll build on what I described in the previous blog posts of this series (Short Compendium of programming Docker Model Runner with Golang) and, with OpenAI's Go SDK, Docker Model Runner, and the Docker MCP Toolkit, implement a minimal AI Agent inspired by Anthropic's definitions.
Ready?
Part 1: The Augmented LLM
Let's start with the basic building block: "the augmented LLM"
Here's how:
flowchart LR
In((In)) --> LLM[LLM]
LLM --> Out((Out))
LLM -.-> |Query/Results| Retrieval[Retrieval]
Retrieval -.-> LLM
LLM -.-> |Call/Response| Tools[Tools]
Tools -.-> LLM
LLM -.-> |Read/Write| Memory[Memory]
Memory -.-> LLM
style In fill:#FFDDDD,stroke:#DD8888
style Out fill:#FFDDDD,stroke:#DD8888
style LLM fill:#DDFFDD,stroke:#88DD88
style Retrieval fill:#DDDDFF,stroke:#8888DD
style Tools fill:#DDDDFF,stroke:#8888DD
style Memory fill:#DDDDFF,stroke:#8888DD
Let's first set up the essential starting point: the connection to the LLM and chat completion.
I'll call my new project Robby (📝: Robby from Forbidden Planet)
Chat Completion or Retrieval
1. General Structure
package robby
import (
"context"
"errors"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
This code section defines a package named robby, representing an AI agent library. The imports notably include openai-go, the official client for interacting with OpenAI APIs, and thus with Docker Model Runner, since it's compliant with this API.
2. Agent Definition
type Agent struct {
ctx context.Context
dmrClient openai.Client
Params openai.ChatCompletionNewParams
lastError error
}
Agent Structure
This structure represents the agent's state with:
ctx (context.Context): Controls the lifecycle of operations and cancellation propagation
dmrClient (openai.Client): Client to communicate with Docker Model Runner and an LLM
Params (openai.ChatCompletionNewParams): Configuration for LLM requests
lastError (error): Internal mechanism to propagate errors during configuration
Here are the two points that guided my implementation choices:
I'll stick to a minimalist architecture: the agent is designed as a thin abstraction layer on top of the language model API
I've tried, and will keep trying, to respect the separation of concerns as much as possible: configuration vs. execution
3. Functional Options Pattern
type AgentOption func(*Agent)
Definition and Usage
This pattern is a Go idiom for configuring objects flexibly while keeping the source code easy to extend:
Define a function type that accepts and modifies the object
Helper functions (the With... options below) return such functions, configured according to your needs
The developer composes the options they want to apply
Pattern Advantages
Fluid API: Enables readable and chainable configuration
Extensibility: Easy to add new options without modifying existing signatures
Implicit Default Values: Only necessary options are specified
Partial Configuration: Options can be stored and reused
4. Specific Options
Connection to the Language Model (LLM)
func WithDMRClient(ctx context.Context, baseURL string) AgentOption {
return func(agent *Agent) {
agent.ctx = ctx
agent.dmrClient = openai.NewClient(
option.WithBaseURL(baseURL),
option.WithAPIKey(""),
)
}
}
This option configures:
The execution context
A client to communicate with Docker Model Runner with the specified URL
Uses an empty API key (with Docker Model Runner, OpenAI's API key is not necessary)
Parameters Configuration
func WithParams(params openai.ChatCompletionNewParams) AgentOption {
return func(agent *Agent) {
agent.Params = params
}
}
This option defines parameters for completion requests:
Model to use
Messages to process
Generation parameters (temperature, etc.)
We'll see how to use it shortly with an example.
5. Constructor
func NewAgent(options ...AgentOption) (*Agent, error) {
agent := &Agent{}
// Apply all options
for _, option := range options {
option(agent)
}
if agent.lastError != nil {
return nil, agent.lastError
}
return agent, nil
}
Operation
The NewAgent constructor creates an empty agent
Then sequentially applies all provided options
Checks if an error occurred during configuration
Returns an "instance" of the configured agent or the error
The characteristics of the NewAgent constructor are as follows:
Variadic Arguments: Accepts a variable number of options
Failing Fast: Returns immediately in case of error
Idiomatic Go Interface: Returns a pair (result, error)
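As a quick illustration of the variadic and "reusable options" aspects, here's a minimal sketch on my side (not taken from the repository): options are plain values, so they can be stored in a slice and reused to build several agents sharing a base configuration.
// A minimal sketch (not from the repository): since NewAgent is variadic,
// options can be stored in a slice and reused for several agents.
baseOptions := []AgentOption{
	WithDMRClient(context.Background(), "http://localhost:12434/engines/v1"),
}
// Reuse the shared base configuration and add agent-specific parameters.
pizzaExpert, err := NewAgent(append(baseOptions,
	WithParams(openai.ChatCompletionNewParams{
		Model: "ai/qwen2.5:latest",
		Messages: []openai.ChatCompletionMessageParamUnion{
			openai.SystemMessage("You are a pizza expert"),
		},
	}),
)...)
if err != nil {
	fmt.Println("Error:", err)
	return
}
_ = pizzaExpert // the agent is ready to be used with the methods defined below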
Now that we have the agent skeleton, we can implement the agent's "intelligent" methods, the completion methods.
6. Completion Methods
We'll use 2 methods from OpenAI's SDK:
Chat.Completions.New, the synchronous version
Chat.Completions.NewStreaming, the version that allows streaming the LLM's response progressively (this avoids the "waiting" effect and enables displaying content very quickly to improve user experience).
Synchronous Completion
func (agent *Agent) ChatCompletion() (string, error) {
completion, err := agent.dmrClient.Chat.Completions.New(agent.ctx, agent.Params)
if err != nil {
return "", err
}
if len(completion.Choices) > 0 {
return completion.Choices[0].Message.Content, nil
} else {
return "", errors.New("no choices found")
}
}
This method:
Sends a synchronous request to the model (LLM)
Handles communication errors
Extracts and returns the content of the first response or an error
Let's move on to the "streaming" version
Streaming Completion
func (agent *Agent) ChatCompletionStream(callBack func(self *Agent, content string, err error) error) (string, error) {
response := ""
stream := agent.dmrClient.Chat.Completions.NewStreaming(agent.ctx, agent.Params)
var cbkRes error
for stream.Next() {
chunk := stream.Current()
// Stream each chunk as it arrives
if len(chunk.Choices) > 0 && chunk.Choices[0].Delta.Content != "" {
cbkRes = callBack(agent, chunk.Choices[0].Delta.Content, nil)
response += chunk.Choices[0].Delta.Content
}
if cbkRes != nil {
break
}
}
if cbkRes != nil {
return response, cbkRes
}
if err := stream.Err(); err != nil {
return response, err
}
if err := stream.Close(); err != nil {
return response, err
}
return response, nil
}
This method is more sophisticated, it:
Establishes a streaming flow with the model
Processes response fragments as they arrive
Invokes a callback for each fragment, allowing real-time processing
Aggregates the complete response
Handles errors at different stages (during streaming, at the end, when closing)
Allows early interruption via callback return
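To illustrate the last point, here's a minimal sketch on my side (assuming an agent already created with NewAgent): returning a non-nil error from the callback interrupts the stream.
// Sketch: stop the stream early by returning a non-nil error from the callback.
maxChars := 200
received := 0
response, err := agent.ChatCompletionStream(func(self *Agent, content string, err error) error {
	fmt.Print(content)
	received += len(content)
	if received >= maxChars {
		// A non-nil return value breaks the streaming loop in ChatCompletionStream.
		return errors.New("enough content received, stopping the stream")
	}
	return nil
})
if err != nil {
	fmt.Println("\nStream stopped:", err)
}
// response contains everything received before the interruption.
fmt.Println("Partial response length:", len(response))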
To conclude this first implementation phase, I'd say that the architecture I'm using allows adding features without changing the base API, thus ensuring code scalability (at least I hope so 😂). Let's now see how to use our AI agent embryo.
Example
This code creates and uses an AI agent named bob to answer a question about pizzas. And bob will display the response as it's generated.
bob, err := NewAgent(
WithDMRClient(
context.Background(),
"http://model-runner.docker.internal/engines/llama.cpp/v1/",
),
WithParams(
openai.ChatCompletionNewParams{
Model: "ai/qwen2.5:latest",
Messages: []openai.ChatCompletionMessageParamUnion{
openai.SystemMessage("You are a pizza expert"),
openai.UserMessage("[Brief] What is the best pizza in the world?"),
},
Temperature: openai.Opt(0.9),
},
),
)
if err != nil {
fmt.Println("Error:", err)
return
}
bob.ChatCompletionStream(func(self *Agent, content string, err error) error{
fmt.Print(content)
return nil
})
I use http://model-runner.docker.internal/engines/llama.cpp/v1/ to "contact" Docker Model Runner because my application runs in a container. If you run the application directly on your machine, use this address: http://localhost:12434/engines/v1
Don't forget to load the ai/qwen2.5:latest model with the command docker model pull ai/qwen2.5:latest.
Explanation of the different program steps:
Agent creation:
- NewAgent() instantiates a new agent with two configurations
- WithDMRClient connects the agent to a local language model instance (llama.cpp engine served by Docker Model Runner)
- WithParams configures the conversation parameters: uses the Qwen 2.5 model, sets a pizza expert role (system message), asks the question "What is the best pizza in the world?", and sets the temperature to 0.9 (favoring creativity)
Error handling:
- Checks if initialization succeeded and displays the error if not
Receiving the response in streaming:
- ChatCompletionStream starts response generation in streaming mode
- Uses a callback function that receives each text fragment as it comes
- Displays these fragments in real time on screen with fmt.Print()
And when I run my code, I'll get a response like this:
Determining the "best" pizza in the world can be highly subjective, as it largely depends on personal preferences, regional tastes, and specific ingredients and cooking methods. However, there are several widely recognized and highly regarded pizzas from around the world that are celebrated by pizza enthusiasts.
1. **Neapolitan Pizza**: Often considered the pinnacle of pizza, Neapolitan pizza is made with San Marzano tomatoes, fresh mozzarella, and fresh basil. It's traditionally cooked in a wood-fired oven at very high temperatures, giving it a thin, soft crust and a slightly charred exterior. Pizza Di Matteo in Naples, Italy, is one of the oldest pizzerias and is a notable example.
2. **New York Slice**: Known for its thick, chewy crust, often hand-tossed and baked in a coal-fired oven, New York-style pizza is a favorite among many. Lombardi's in New York City is often credited as the first pizzeria in the United States, established in 1905.
3. **Chicago Deep-Dish Pizza**: This pizza has a thicker, doughier crust that is baked in a pan, resulting in a deeper, more substantial slice. The deep-dish style is often filled with cheese, vegetables, and meat, and then topped with a second layer of cheese. Gino's East in Chicago is one of the most famous establishments known for this style.
4. **Margherita Pizza**: Another classic Neapolitan pizza, the Margherita is simple yet elegant, featuring a thin crust, fresh mozzarella, San Marzano tomatoes, and fresh basil, often with a drizzle of extra virgin olive oil. La Piazza in San Francisco, California, is known for its authentic Neapolitan Margherita pizza.
5. **Sicilian Pizza**: Known for its thick, square-shaped crust, Sicilian pizza is baked in a stone oven and often features a thicker layer of tomato sauce and toppings. Pizzeria Bianco in Phoenix, Arizona, has gained international recognition for its authentic Sicilian-style pizza.
6. **California Pizza**: This style of pizza is known for its unique toppings, often including non-traditional ingredients like pineapple, chicken, and pine nuts. Pizzeria Mozza in Los Angeles, California, is a pioneer in this style, offering a range of creative toppings.
Each of these pizzas has its unique characteristics and dedicated fans. The "best" pizza is ultimately a matter of personal preference, but these examples represent some of the world's most celebrated and delicious pizzas.
Memory
Regarding the memory management of agent bob, I'll keep it very simple and just use the basic features of the OpenAI SDK. I'll proceed as follows:
response, err := bob.ChatCompletionStream(func(self *Agent, content string, err error) error{
fmt.Print(content)
return nil
})
if err != nil {
fmt.Println("Error:", err)
return
}
// Add the assistant message to the messages to keep the conversation going
bob.Params.Messages = append(bob.Params.Messages, openai.AssistantMessage(response))
So I simply add the generated response to the ongoing message list using an "assistant" message: openai.AssistantMessage(response). This allows me to maintain conversational memory.
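For example, a follow-up question (a quick sketch on my side, reusing the same bob agent) now benefits from this memory, because the previous answer is already part of the message list:
// Sketch: ask a follow-up question; the previous answer is already stored
// in bob.Params.Messages as an assistant message.
bob.Params.Messages = append(bob.Params.Messages,
	openai.UserMessage("[Brief] And how do I cook it at home?"),
)
followUp, err := bob.ChatCompletionStream(func(self *Agent, content string, err error) error {
	fmt.Print(content)
	return nil
})
if err != nil {
	fmt.Println("Error:", err)
	return
}
// Keep the conversation going by storing this new answer as well.
bob.Params.Messages = append(bob.Params.Messages, openai.AssistantMessage(followUp))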
You can find the code for this first part here: https://github.com/sea-monkeys/robby/blob/building_blocks_chat/robby.go
We can now add the "Tools" part to our "future agent".
Function calling
Now, let's look at the modifications I've made to enable "function calling"
1. General Structure and Imports
The imports remain similar with a few additions:
encoding/json: For serialization/deserialization of tool arguments
fmt: For string formatting and errors
package robby
import (
"context"
"encoding/json"
"errors"
"fmt"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
2. Extended Agent Structure
I then modified the Agent structure by adding two new fields:
Tools: List of tools available to the agent
ToolCalls: List of tool calls detected in model responses
type Agent struct {
ctx context.Context
dmrClient openai.Client
Params openai.ChatCompletionNewParams
Tools []openai.ChatCompletionToolParam
ToolCalls []openai.ChatCompletionMessageToolCall
lastError error
}
3. New Option for Tools
I created a new option to configure the tools available to the agent directly during its creation.
func WithTools(tools []openai.ChatCompletionToolParam) AgentOption {
return func(agent *Agent) {
agent.Tools = tools
}
}
Advantages
Consistency: Follows the same functional options pattern as other configurations
Flexibility: Allows specifying which tools will be available for each agent instance
4. Tool Call Detection
I then added a new type of completion method to the Agent structure. It works as follows:
Assigns the configured tools to the request parameters
Sends a completion request to the model
Extracts tool calls from the response
Stores these calls in the agent's state for later use
func (agent *Agent) ToolsCompletion() ([]openai.ChatCompletionMessageToolCall, error) {
agent.Params.Tools = agent.Tools
completion, err := agent.dmrClient.Chat.Completions.New(agent.ctx, agent.Params)
if err != nil {
return nil, err
}
detectedToolCalls := completion.Choices[0].Message.ToolCalls
if len(detectedToolCalls) == 0 {
return nil, errors.New("no tool calls detected")
}
agent.ToolCalls = detectedToolCalls
return detectedToolCalls, nil
}
5. Tool Call Execution
Once tool calls have been detected in the model's response, we can give the agent the ability to execute a function (func(any) (any, error)) for each call, using the ExecuteToolCalls method.
This new method:
Accepts a map of tool implementation functions, each keyed by the name of the associated tool
Iterates through detected tool calls
Checks if each called tool is implemented
Deserializes JSON arguments
Executes the corresponding tool function
Collects responses and adds them to the conversation history
Returns responses as strings
func (agent *Agent) ExecuteToolCalls(toolsImpl map[string]func(any) (any, error)) ([]string, error) {
responses := []string{}
for _, toolCall := range agent.ToolCalls {
// Check if the tool is implemented
toolFunc, ok := toolsImpl[toolCall.Function.Name]
if !ok {
return nil, fmt.Errorf("tool %s not implemented", toolCall.Function.Name)
}
var args map[string]any
err := json.Unmarshal([]byte(toolCall.Function.Arguments), &args)
if err != nil {
return nil, err
}
// Call the tool with the arguments
toolResponse, err := toolFunc(args)
if err != nil {
responses = append(responses, fmt.Sprintf("%v", err))
} else {
responses = append(responses, fmt.Sprintf("%v", toolResponse))
agent.Params.Messages = append(
agent.Params.Messages,
openai.ToolMessage(
fmt.Sprintf("%v", toolResponse),
toolCall.ID,
),
)
}
}
if len(responses) == 0 {
return nil, errors.New("no tool responses found")
}
return responses, nil
}
Advantages and Notes
Inversion of control: Tool implementations are provided by the caller
Flexibility: Allows using any function as tool implementation
Conversation history: Automatically enriches history with tool responses
Error handling: Captures and transforms errors into responses
Dynamic typing: Uses any for more flexibility in arguments and returns
6. A little "helper": Tool Call Serialization
This utility function converts a list of tool calls into a formatted, indented JSON string. This wasn't mandatory, but it can help for debugging, for example:
func ToolCallsToJSONString(tools []openai.ChatCompletionMessageToolCall) (string, error) {
var jsonData []any
// Convert tools to generic interface
for _, tool := range tools {
var args any
if err := json.Unmarshal([]byte(tool.Function.Arguments), &args); err != nil {
return "", err
}
jsonData = append(jsonData, map[string]any{
"id": tool.ID,
"function": map[string]any{
"name": tool.Function.Name,
"arguments": args,
},
})
}
// Marshal back to JSON with indentation
jsonString, err := json.MarshalIndent(jsonData, "", " ")
if err != nil {
return "", err
}
return string(jsonString), nil
}
You can find the updated version here: https://github.com/sea-monkeys/robby/blob/building_blocks_tools/robby.go
Let's see how to use our new version.
Example
// Tools definition
sayHelloTool := openai.ChatCompletionToolParam{
Function: openai.FunctionDefinitionParam{
Name: "say_hello",
Description: openai.String("Say hello to the given person name"),
Parameters: openai.FunctionParameters{
"type": "object",
"properties": map[string]interface{}{
"name": map[string]string{
"type": "string",
},
},
"required": []string{"name"},
},
},
}
vulcanSaluteTool := openai.ChatCompletionToolParam{
Function: openai.FunctionDefinitionParam{
Name: "vulcan_salute",
Description: openai.String("Give a vulcan salute to the given person name"),
Parameters: openai.FunctionParameters{
"type": "object",
"properties": map[string]interface{}{
"name": map[string]string{
"type": "string",
},
},
"required": []string{"name"},
},
},
}
// Agent creation
bob, err := NewAgent(
WithDMRClient(
context.Background(),
"http://model-runner.docker.internal/engines/llama.cpp/v1/",
),
WithParams(
openai.ChatCompletionNewParams{
Model: "ai/qwen2.5:latest",
Messages: []openai.ChatCompletionMessageParamUnion{
openai.UserMessage(`
Say hello to Bob.
Give a vulcan salute to James Kirk.
Say hello to Spock.
`),
},
Temperature: openai.Opt(0.0),
ParallelToolCalls: openai.Bool(true),
},
),
WithTools([]openai.ChatCompletionToolParam{
sayHelloTool,
vulcanSaluteTool,
}),
)
if err != nil {
fmt.Println("Error:", err)
return
}
// Start the tools completion
toolCalls, err := bob.ToolsCompletion()
if err != nil {
fmt.Println("Error:", err)
return
}
This will automatically add the list of tools to the agent:
agent.Params.Tools = agent.Tools
As well as the list of detected tool calls:
agent.ToolCalls = detectedToolCalls
And, once tool calls are detected, we can trigger the execution of associated functions:
// Display the list of the tool calls
toolCallsJSON, _ := ToolCallsToJSONString(toolCalls)
fmt.Println("Tool Calls:\n", toolCallsJSON)
// Execution of the functions
results, err := bob.ExecuteToolCalls(map[string]func(any) (any, error){
"say_hello": func(args any) (any, error) {
name := args.(map[string]any)["name"].(string)
return fmt.Sprintf("👋 Hello, %s!", name), nil
},
"vulcan_salute": func(args any) (any, error) {
name := args.(map[string]any)["name"].(string)
return fmt.Sprintf("🖖 Live long and prosper, %s!", name), nil
},
})
if err != nil {
fmt.Println("Error:", err)
return
}
// Display the results
fmt.Println("\nResult of the tool calls execution:")
for _, result := range results {
fmt.Println(result)
}
When I run the example code, I'll get a result like this:
Tool Calls:
[
{
"function": {
"arguments": {
"name": "Bob"
},
"name": "say_hello"
},
"id": "yw4BoKlzw3UuUhqYplGZvqpIC76IwyFe"
},
{
"function": {
"arguments": {
"name": "James Kirk"
},
"name": "vulcan_salute"
},
"id": "sQRybAcQC6X8sIeepPdGAXHAubyEy5zZ"
},
{
"function": {
"arguments": {
"name": "Spock"
},
"name": "say_hello"
},
"id": "PhrySmExjKAQhO2oSBsJrttax1yyMYtx"
}
]
Result of the tool calls execution:
👋 Hello, Bob!
🖖 Live long and prosper, James Kirk!
👋 Hello, Spock!
So we've reached the end of Part 1: we've built a version of the "augmented LLM" pattern, and even a "little bit more", since we offer the possibility to execute functions based on the tool calls detected in the model's response. This is therefore a first step towards the second part of this topic. In the next part, we'll see how to move from the "augmented LLM" to the "AI Agent" using the Docker MCP Toolkit.
Part 2: AI Agent
I'll greatly shorten the definition (feel free to read Anthropic's documentation on the subject in full). Agents can handle complex tasks (more complex than what we saw in Part 1), but their implementation is often simple, with the complexity decoupled through the use of external tools. For example, an agent can be an "augmented LLM" with advanced tool-use capabilities that allow it to interact with its environment and benefit from a feedback loop:
flowchart LR
Human((Human)) <-.-> LLM[LLM Call]
LLM -->|Action| Environment((Environment))
Environment -->|Feedback| LLM
LLM -.-> Stop[Stop]
classDef human fill:#FFDDDD,stroke:#DD8888
classDef llm fill:#DDFFDD,stroke:#88DD88
classDef stop fill:#DDDDFF,stroke:#8888DD
classDef env fill:#FFDDDD,stroke:#DD8888
class Human human
class LLM llm
class Stop stop
class Environment env
To truly transform our "augmented LLM" into an "AI Agent", I propose adding MCP support (as a client of an MCP server) and using the Docker MCP Toolkit to connect to the various MCP servers available through it. Feel free to read my two previous blog posts: Boosting Docker Model Runner with Docker MCP Toolkit, and Hybrid Prompts with Docker Model Runner and the MCP Toolkit. I'll use what I describe in these two posts to finalize our "AI Agent".
1. Hybrid Agent Structure
This new structure offers two ways to execute tools:
Direct approach: via the Tools and ToolCalls fields, for internal tool management
Delegated approach: via the mcpClient and mcpCmd fields, for access to an external MCP client
type Agent struct {
ctx context.Context
dmrClient openai.Client
Params openai.ChatCompletionNewParams
Tools []openai.ChatCompletionToolParam
ToolCalls []openai.ChatCompletionMessageToolCall
mcpClient *mcp_golang.Client
mcpCmd *exec.Cmd
lastError error
}
This approach provides maximum flexibility: the agent can use locally implemented tools or external tools via MCP.
2. Expanded Configuration Options
The code now allows MCP client configuration:
WithMCPClient to initialize the MCP client and connect to the Docker MCP Toolkit
WithMCPTools to filter the tools we want to use.
type STDIOCommandOption []string
func WithDockerMCPToolkit() STDIOCommandOption {
return STDIOCommandOption{
"docker",
"run",
"-i",
"--rm",
"alpine/socat",
"STDIO",
"TCP:host.docker.internal:8811",
}
}
func WithSocatMCPToolkit() STDIOCommandOption {
return STDIOCommandOption{
"socat",
"STDIO",
"TCP:host.docker.internal:8811",
}
}
Explanation of STDIO Connection Options in Robby
STDIOCommandOption is simply a named type for a slice of strings. It represents the command-line arguments used to establish a connection between the agent and an MCP server.
The WithDockerMCPToolkit() Function
func WithDockerMCPToolkit() STDIOCommandOption {
return STDIOCommandOption{
"docker",
"run",
"-i",
"--rm",
"alpine/socat",
"STDIO",
"TCP:host.docker.internal:8811",
}
}
This function returns a configuration to establish a connection to an MCP server using the Docker MCP Toolkit. Here's the detail of each element:
"docker"
: The Docker command itself"run"
: Subcommand to run a container"-i"
: Flag for interactive mode, allowing I/O via STDIN"--rm"
: Automatically removes the container after use"alpine/socat"
: Docker image containing the socat utility based on Alpine Linux"STDIO"
: First argument to socat, configures standard I/O"TCP:host.docker.internal:8811"
: Second argument to socat, establishes a TCP connection to the MCP server
This approach uses the socat (Socket CAT) utility to establish a bridge between:
The standard input/output (STDIO) of the process
A TCP connection to host.docker.internal:8811 (a special address recognized in Docker to reference the host)
The MCP server (the Docker MCP Toolkit actually, which acts as a proxy) listens on port 8811, and this configuration allows the agent to communicate with it through an isolated Docker container.
The WithSocatMCPToolkit() Function
func WithSocatMCPToolkit() STDIOCommandOption {
return STDIOCommandOption{
"socat",
"STDIO",
"TCP:host.docker.internal:8811",
}
}
This function returns a configuration to establish a direct connection to the Docker MCP Toolkit using the socat utility installed locally. It can be useful if you're running the agent in a container, for example, and want to avoid Docker-in-Docker. However, you'll still need to install/provide the socat utility in that environment.
Usage in the Global Architecture: WithMCPClient
These options are used with the WithMCPClient function, which will initialize the MCP client:
func WithMCPClient(command STDIOCommandOption) AgentOption {
return func(agent *Agent) {
cmd := exec.Command(
command[0],
command[1:]...,
)
stdin, err := cmd.StdinPipe()
if err != nil {
agent.lastError = fmt.Errorf("failed to get stdin pipe: %v", err)
return
}
stdout, err := cmd.StdoutPipe()
if err != nil {
agent.lastError = fmt.Errorf("failed to get stdout pipe: %v", err)
return
}
if err := cmd.Start(); err != nil {
agent.lastError = fmt.Errorf("failed to start server: %v", err)
return
}
clientTransport := stdio.NewStdioServerTransportWithIO(stdout, stdin)
mcpClient := mcp_golang.NewClient(clientTransport)
if _, err := mcpClient.Initialize(agent.ctx); err != nil {
agent.lastError = fmt.Errorf("failed to initialize client: %v", err)
return
}
agent.mcpClient = mcpClient
agent.mcpCmd = cmd
}
}
Note
You can use the connection option like this: WithMCPClient(WithDockerMCPToolkit()) or WithMCPClient(WithSocatMCPToolkit()), but you can also specify your own command with its arguments:
WithMCPClient([]string{
"docker",
"run",
"-i",
"--rm",
"alpine/socat",
"STDIO",
"TCP:host.docker.internal:8811",
}),
So theoretically, you can connect to other MCP servers, not necessarily provided by the Docker MCP Toolkit.
Usage of: WithMCPTools
This option allows filtering the tools offered by the Docker MCP Toolkit, which can be numerous, keeping only those needed and converting them to the OpenAI format. This also makes things easier for the LLM, since it works with a lighter catalog rather than having to sift through a very large catalog of tools.
func WithMCPTools(tools []string) AgentOption {
return func(agent *Agent) {
// Get the tools from the MCP client
mcpTools, err := agent.mcpClient.ListTools(agent.ctx, nil)
if err != nil {
agent.lastError = err
return
}
// Convert the tools to OpenAI format
filteredTools := []mcp_golang.ToolRetType{}
for _, tool := range mcpTools.Tools {
for _, t := range tools {
if tool.Name == t {
filteredTools = append(filteredTools, tool)
}
}
}
agent.Tools = convertToOpenAITools(filteredTools)
}
}
The process is done in 3 steps:
Discovery: Retrieving all available tools via the MCP client
Filtering: Selecting tools specified by name in the provided list
Conversion: Transforming MCP tools into OpenAI-compatible format
And here's the code for the helper to convert the tools:
func convertToOpenAITools(tools []mcp_golang.ToolRetType) []openai.ChatCompletionToolParam {
openAITools := make([]openai.ChatCompletionToolParam, len(tools))
for i, tool := range tools {
schema := tool.InputSchema.(map[string]any)
openAITools[i] = openai.ChatCompletionToolParam{
Function: openai.FunctionDefinitionParam{
Name: tool.Name,
Description: openai.String(*tool.Description),
Parameters: openai.FunctionParameters{
"type": "object",
"properties": schema["properties"],
"required": schema["required"],
},
},
}
}
return openAITools
}
This function establishes a correspondence between:
The proprietary format of MCP tools
The standard format of OpenAI tools
3. Dual Tool Call Execution Mechanism
Now we have a second method for executing tool calls: ExecuteMCPToolCalls()
func (agent *Agent) ExecuteToolCalls(toolsImpl map[string]func(any) (any, error)) ([]string, error) { ... }
func (agent *Agent) ExecuteMCPToolCalls() ([]string, error) { ... }
Implementation of ExecuteMCPToolCalls()
func (agent *Agent) ExecuteMCPToolCalls() ([]string, error) {
responses := []string{}
for _, toolCall := range agent.ToolCalls {
var args map[string]any
err := json.Unmarshal([]byte(toolCall.Function.Arguments), &args)
if err != nil {
return nil, err
}
// Call the tool with the arguments thanks to the MCP client
toolResponse, err := agent.mcpClient.CallTool(agent.ctx, toolCall.Function.Name, args)
if err != nil {
responses = append(responses, fmt.Sprintf("%v", err))
} else {
if toolResponse != nil && len(toolResponse.Content) > 0 && toolResponse.Content[0].TextContent != nil {
agent.Params.Messages = append(
agent.Params.Messages,
openai.ToolMessage(
toolResponse.Content[0].TextContent.Text,
toolCall.ID,
),
)
responses = append(responses, toolResponse.Content[0].TextContent.Text)
}
}
}
if len(responses) == 0 {
return nil, errors.New("no tool responses found")
}
return responses, nil
}
Comparison of the Two Approaches
| ExecuteToolCalls | ExecuteMCPToolCalls |
| --- | --- |
| Executes tools via local Go functions | Executes tools via the MCP client |
| Receives implementations as a map | Uses implementations registered in the MCP |
| More flexible and controllable | More integrated with the MCP ecosystem |
| Ideal for testing and isolated environments | Ideal for production with external tools |
Common Points
Both methods:
Iterate through the ToolCalls stored in the agent
Deserialize JSON arguments
Execute corresponding tool functions
Collect responses
Update conversation history with tool results
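Since both methods append the tool results to the conversation history, a natural follow-up (a quick sketch on my side, not part of the example that follows) is to send one more completion request so the model can formulate a final answer from those results:
// Sketch: after ExecuteMCPToolCalls (or ExecuteToolCalls), the tool responses
// are already in agent.Params.Messages as tool messages.
// Optionally reset agent.Params.Tools so the model answers in plain text
// instead of requesting new tool calls.
agent.Params.Tools = nil
agent.Params.Messages = append(agent.Params.Messages,
	openai.UserMessage("Use the previous tool results to write a short summary."),
)
finalAnswer, err := agent.ChatCompletion()
if err != nil {
	fmt.Println("Error:", err)
	return
}
fmt.Println(finalAnswer)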
You can find the updated code here: https://github.com/sea-monkeys/robby/blob/building_blocks_mcp_agent/robby.go
To conclude, here's an example of usage, using the Brave MCP server (I explain how to use it in the previous blog post: Hybrid Prompts with Docker Model Runner and the MCP Toolkit).
Example
This example runs an AI agent named bob designed to perform web searches on two types of pizzas:
bob, err := NewAgent(
WithDMRClient(
context.Background(),
"http://model-runner.docker.internal/engines/llama.cpp/v1/",
),
WithParams(
openai.ChatCompletionNewParams{
Model: "ai/qwen2.5:latest",
Messages: []openai.ChatCompletionMessageParamUnion{
openai.UserMessage(`
Search information about Hawaiian pizza.(only 3 results)
Search information about Mexican pizza.(only 3 results)
`),
},
Temperature: openai.Opt(0.0),
ParallelToolCalls: openai.Bool(true),
},
),
WithMCPClient(WithDockerMCPToolkit()),
WithMCPTools([]string{"brave_web_search"}),
)
if err != nil {
fmt.Println("Error:", err)
return
}
toolCalls, err := bob.ToolsCompletion()
if err != nil {
fmt.Println("Error:", err)
return
}
toolCallsJSON, _ := ToolCallsToJSONString(toolCalls)
fmt.Println("Tool Calls:\n", toolCallsJSON)
if len(toolCalls) == 0 {
fmt.Println("No tools found.")
return
}
results, err := bob.ExecuteMCPToolCalls()
if err != nil {
fmt.Println("Error:", err)
return
}
fmt.Println("\nResult of the MCP tool calls execution:")
for idx, result := range results {
fmt.Println(fmt.Sprintf("%d.", idx), result)
}
And when I run it, I get something like this:
Tool Calls:
[
{
"function": {
"arguments": {
"count": 3,
"offset": 0,
"query": "Hawaiian pizza"
},
"name": "brave_web_search"
},
"id": "24dkGIjlodQGKhkTHPHNyd82MKtCXhMz"
},
{
"function": {
"arguments": {
"count": 3,
"offset": 0,
"query": "Mexican pizza"
},
"name": "brave_web_search"
},
"id": "vRU0jszrJI3mtbQCKc1ZUNuzS5hscu2T"
}
]
Result of the MCP tool calls execution:
0. Title: Hawaiian pizza - Wikipedia
Description: <strong>Hawaiian</strong> <strong>pizza</strong> is a <strong>pizza</strong> invented in Canada, topped with pineapple, tomato sauce, mozzarella cheese, and either ham or bacon. Sam Panopoulos, a Greek-born Canadian, created the first <strong>Hawaiian</strong> <strong>pizza</strong> at the Satellite Restaurant in Chatham-Kent, Ontario, Canada, in 1962.
URL: https://en.wikipedia.org/wiki/Hawaiian_pizza
Title: Hawaiian Pizza - Sally's Baking Addiction
Description: Classic <strong>Hawaiian</strong> <strong>Pizza</strong> combines cheese, cooked ham, pineapple and bacon. To take your <strong>pizza</strong> to the next level, use this homemade <strong>pizza</strong> crust!
URL: https://sallysbakingaddiction.com/hawaiian-pizza/
Title: Easy Hawaiian Pizza - Cooking For My Soul
Description: This cheesy and delicious homemade <strong>Hawaiian</strong> <strong>pizza</strong> is so easy to make at home. Made with fresh pineapples, ham, and oregano!
URL: https://cookingformysoul.com/hawaiian-pizza/
1. Title: Easy Mexican Pizza - Cooking in the Midwest
Description: <strong>Mexican</strong> <strong>Pizzas</strong> are easy to make and so good! These are a Taco Bell Copy Cat recipe and perfect for switching up Taco Tuesday!
URL: https://cookinginthemidwest.com/blog/easy-mexican-pizza/
Title: Mexican Pizza | Taco Bell®
Description: Order a <strong>Mexican</strong> <strong>Pizza</strong> online at Taco Bell for delivery or pick-up near you.
URL: https://www.tacobell.com/food/specialties/mexican-pizza
Title: Mexican Pizza Recipe
Description: The pizzas are <strong>topped with layers of salsa, Monterey Jack and Cheddar cheeses, tomatoes, green onion and jalapeño slices</strong>. These homemade Mexican pizzas are best served immediately so the tortillas stay crisp. However, if you have leftovers, you can wrap them in foil (or place them in an airtight ...
URL: https://www.allrecipes.com/recipe/53075/jimmys-mexican-pizza/
So we can clearly see that the agent was able to search the Internet using the Brave MCP server. 🎉
Conclusion
We were able to simply implement an AI Agent with an "LLM + tools" architecture that has the following capabilities:
Perception: The agent receives a user request
Decision: The language model decides which tools to use and how
Action: Tools are executed to interact with the external world (web search)
Result: The information obtained is returned to the user
This operation therefore extends the capabilities of the LLM by giving it (with the help of the application and MCP client) access to external and up-to-date information sources.
That's it for now. My goal was to make a minimal implementation of an AI Agent that isn't complex. In the future, I'd like to add some features, like a RAG system, and also a "looping" tool completion method to allow tool results to be used by other tools, as I describe in this post: Hybrid Prompts with Docker Model Runner and the MCP Toolkit. We'll see in the future, depending on use cases, if other features need to be added.
And here's the birth of yet another side project 😂 https://github.com/sea-monkeys/robby 🤖