Building an MCP Client 101: Let’s Build One for a Gemini Chat Agent

Paul Fruitful

🚀Day 52 of #100daysofAIEngineering

After a 20-day break (thanks to work and some unforeseen chaos 😅), we’re officially back and stronger than ever! This time, with a new twist: we’re diving into the world of agents and agentic protocols, and building for it.

Before the break, we were building MCP servers like everyone else. MCP servers are everywhere. Everyone’s building them, talking about them, deploying them.
But… what about MCP clients?
No one’s really talking about how to build one.

That’s what we’re doing today.

In this post, you’ll learn how to build your own MCP client, something your AI agents and platforms can actually talk to. And guess what? Building one is way easier than you think.

This guide will be more approachable, more practical, and easier to follow than the original docs. Let’s get building. 🚀

Setting Up the Project

To follow along, you’ll need Python (yes, I’m Python biased 😄) and uv, a lightning-fast package manager for Python that makes managing environments and dependencies way smoother.

Step 1: Install Python

Make sure Python is installed on your machine. You can download it from the official website, python.org.

Step 2: Install uv

Once Python is ready, install uv using the instructions below, based on your OS:

On Windows (PowerShell):

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

On Mac/Linux (Terminal):

curl -LsSf https://astral.sh/uv/install.sh | sh

Step 3: Initialize Your MCP Client Project

Run the following command to scaffold a new project:

uv init gemini-mcp-client
cd gemini-mcp-client

Now your project is set up and ready to go!


Step 4: Create and Activate the Virtual Environment

uv init scaffolds the project but doesn’t create the virtual environment itself, so create one first:

uv venv

Then, before installing any dependencies, activate it:

On Windows:

.venv\Scripts\activate

On Mac/Linux:

source .venv/bin/activate

The dependencies our MCP client project needs are:

  • mcp – This is the core library that contains all the classes and methods required for interacting with the MCP server.

  • python-dotenv – This allows the project to securely load environment variables (like API keys) from a .env file (see the sample .env right after this list).

  • gemini-tool-agent – A Gemini-powered agent I built with built-in tool-calling capabilities. It makes calling MCP server tools seamless and intelligent.
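As mentioned above, the client reads its Gemini API key from a .env file. Here’s a minimal one you can drop in the project root; the GEMINI_KEY name matches what our client code reads later, and the value is a placeholder for your own key:

GEMINI_KEY=your-gemini-api-key-here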

Installing the MCP Client Dependencies

With uv installed, installing these dependencies is straightforward. Just run:

uv pip install mcp python-dotenv gemini-tool-agent

This will install all the required packages and set up your environment for building the client.

We’ve got all of that sorted out; now let’s get to building our first MCP client 🚀

🚀 Building the MCP Client

Let’s start by importing the necessary dependencies:

from dotenv import load_dotenv
from contextlib import AsyncExitStack
from typing import Optional
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from gemini_tool_agent.agent import Agent
import os

What Each Import Does

  • load_dotenv (from python-dotenv): Loads environment variables from a .env file into your Python environment. This is how we access sensitive keys like API tokens securely.

  • AsyncExitStack (from contextlib): A flexible context manager that allows us to programmatically manage multiple async context managers. Useful for ensuring all resources are properly cleaned up when the program exits.

  • ClientSession (from mcp): This manages the lifecycle of a client session and handles communication with the MCP server.

  • StdioServerParameters (from mcp): Holds the parameters for launching and connecting to a server over standard input/output, which is how our MCP server is designed to communicate.

  • stdio_client (from mcp.client.stdio): Establishes a connection to the MCP server over stdio using the parameters we've defined.

  • Agent (from gemini-tool-agent): This is our custom Gemini-powered agent with built-in tool-calling functionality. It allows the client to intelligently decide whether to respond directly to a user prompt or call a tool provided by the server.

Let’s now implement the MCP_CLIENT class, which will manage our client session, the Gemini-powered agent, and the lifecycle of our async operations.


load_dotenv()
api_key = os.environ.get("GEMINI_KEY")

class MCP_CLIENT:
    def __init__(self) -> None:
        self.session: Optional[ClientSession] = None
        self.exit = AsyncExitStack()
        self.agent = Agent(api_key)

Breakdown of the Class Properties

Each property in this class plays a vital role in ensuring our client runs smoothly:

  • self.session: Holds the current MCP client session. This is where all communication with the MCP server happens.

  • self.exit: Stores our AsyncExitStack, which manages multiple asynchronous context managers and ensures clean resource teardown when the session ends.

  • self.agent: Stores an instance of our custom Gemini-powered agent (Agent). This allows the client to intelligently handle inputs and call tools provided by the MCP server.

Next, we implement the method that brings our MCP client to life: connect_mcp_server().

This method is responsible for establishing a connection between the client and the MCP server. Most of the servers we’ll be working with communicate via stdio (standard input/output), so we’ll be using the stdio_client method provided by the MCP library to make the connection.

Here’s the full implementation of our connect_mcp_server method:


    async def connect_mcp_server(self, server_script_path):
        is_python = server_script_path.endswith('.py')
        is_js = server_script_path.endswith('.js')
        if not (is_python or is_js):
            raise ValueError("Server script must be a .py or .js file")

        cmd = "python" if is_python else "node"
        server = await self.exit.enter_async_context(
            stdio_client(
                StdioServerParameters(
                    command=cmd,
                    args=[server_script_path],
                    env=None,
                )
            )
        )
        self.stdio, self.write = server
        self.session = await self.exit.enter_async_context(ClientSession(self.stdio, self.write))
        await self.session.initialize()

        response = await self.session.list_tools()
        tools = [{
            "name": tool.name,
            "description": tool.description,
            "input_schema": tool.inputSchema
        } for tool in response.tools]

        self.agent.tools = tools
        print("\nConnected to server with tools:", [tool["name"] for tool in tools])

What’s Going On Here?

  1. Validation:
    We check if the provided server file is a .py or .js script. If it’s not, we raise an error. This is important because our client only knows how to launch Python or Node.js servers.

  2. Command Selection:
    Based on the file extension, we determine whether to use python or node to execute the server.

  3. Server Connection:
    We use stdio_client to establish the connection and wrap it in AsyncExitStack using self.exit.enter_async_context(...) to handle proper resource cleanup.

  4. Destructuring Streams:
    The connection returns two streams: a read stream (self.stdio) and a write stream (self.write). These are used to communicate with the MCP server.

  5. Session Setup:
    With the read/write streams, we initialize a ClientSession and add that to the async stack too.

  6. Tool Discovery:
    Once the session is live, we call list_tools() to discover all available tools exposed by the server. We process the tool information into a list of dictionaries and assign it to the Agent so that it can use them.

At this point, our MCP client is connected, aware of the tools on the server, and ready to use them!
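To make that concrete, a single entry in self.agent.tools might look like this (the name, description, and schema here are illustrative; the real values come from whatever server you connect to):

{
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
    }
}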

The next important method is get_response()

This method is what enables communication between the user and the MCP client.

It takes a prompt from the user, sends it through the Agent, and fetches a response. In the background, the agent figures out if a tool needs to be used, passes the appropriate values to it, and returns a meaningful response to the user.

Here’s the implementation of the get_response() method:

    async def get_response(self, input: str):
        try:
            response = self.agent.process_query(input)
            self.agent.history.append({"role": "user", "content": input})

            if isinstance(response, dict) and response.get("needs_tool", False):
                tool_name = response.get("tool_name", None)
                if tool_name:
                    tool_response = self.agent.process_use_tool(tool_name)
                    self.agent.history.append({"role": "assistant", "content": tool_response})

                    tool = tool_response["tool_name"]
                    call_tool = self.agent.process_use_tool(tool)
                    self.agent.history.append({"role": "process_tool_call", "content": call_tool})

                    result = await self.session.call_tool(tool, call_tool["input"])
                    self.agent.history.append({"role": "tool_call_result", "content": result})

            if isinstance(response, dict) and response.get("needs_direct_response", False):
                self.agent.history.append({"role": "direct_response", "content": response["direct_response"]})
                return response["direct_response"]
            else:
                conversation_context = self.agent.history[-5:] if len(self.agent.history) >= 5 else self.agent.history
                response_text = self.agent.generate_response(f"""
                You are a helpful assistant responding to the following query:
                QUERY: {input}

                CONVERSATION HISTORY: {conversation_context}

                Please provide a comprehensive and accurate response that considers the conversation history.
                """)
                self.agent.history.append({"role": "assistant", "content": response_text})
                return response_text

        except Exception as e:
            return f"An error occurred while processing your request: {str(e)}"

What Does This Method Do?

This method is the core of how users interact with your MCP client. Here’s the breakdown:

1. Process the Initial User Input

response = self.agent.process_query(input)
self.agent.history.append({"role": "user", "content": input})
  • We send the user’s prompt into the agent.

  • We store the prompt in the agent’s conversation history.

2. If a Tool Needs to Be Used…

if response.get("needs_tool", False):
    ...

If the response says a tool is required (see the illustrative response shape after this list):

  • Get the tool name and pass it to process_use_tool to build the tool's usage context.

  • Log the tool response in the agent’s history.

  • Call the tool again with process_use_tool, now extracting actual input values to be used.

  • Use self.session.call_tool(...) to make the actual call to the tool server with input values.

  • Log the tool call result.
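Judging from the keys the code checks, a tool-requiring response from process_query might look something like this (the exact shape is defined by gemini-tool-agent; the values here are hypothetical):

{
    "needs_tool": True,
    "tool_name": "get_weather"
}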

3. If a Direct Response Is Available…

if response.get("needs_direct_response", False):
    return response["direct_response"]

If the agent already has a ready-made response (like a simple fact or predefined answer), we just return that.

4. Otherwise, Generate a Response Using Context

conversation_context = self.agent.history[-5:] if len(self.agent.history) >= 5 else self.agent.history
response_text = self.agent.generate_response(...)

If no tools are needed and there’s no direct response:

  • We pass the last 5 items in the conversation history to give the agent context. (Strictly speaking, the length check in the code is redundant: history[-5:] already returns the whole list when it holds fewer than five items.)

  • Then we ask it to generate a smart and helpful answer using that context.

  • The result is saved to the agent’s history and returned to the user.

5. Error Handling

except Exception as e:
    return f"An error occurred while processing your request: {str(e)}"

If anything goes wrong during the process (invalid tool call, session failure, etc.), we catch it and return a friendly error message.
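Before we wire up the interactive loop, here’s a minimal single-turn sketch of the class in action. The server path is hypothetical; any stdio MCP server script will do:

import asyncio

async def single_turn():
    # Connect, ask one question, and tear everything down.
    client = MCP_CLIENT()
    await client.connect_mcp_server("demo_server.py")  # hypothetical server path
    answer = await client.get_response("What's the weather in Lagos?")
    print(answer)
    await client.close()

asyncio.run(single_turn())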

Next, we implement the chat_loop() method, which starts the conversation session with the user and keeps a continuous back-and-forth going between the client and the user until the user exits.

Implementation:

    async def chat_loop(self):
        """Main chat loop for interacting with the MCP server"""
        print("Chat session started. Type 'exit' to quit.")

        while True:
            try:
                user_input = input("\nYou: ").strip()

                if user_input.lower() == 'exit':
                    print("Ending chat session...")
                    break

                if not user_input:
                    continue

                try:
                    response = await self.get_response(user_input)
                except Exception as e:
                    print(f"\nError occurred: {str(e)}")
                    continue

                if response:
                    print("\nAssistant:", response)

            except Exception as e:
                print(f"\nError occurred: {str(e)}")
                continue


Now we are all set, but before we wrap up, we need a method that closes the current session and kills the connection. Because everything was registered on the AsyncExitStack, a single aclose() call unwinds the contexts in reverse order: the session first, then the stdio transport.

Implementing the close() method

    async def close(self):
        await self.exit.aclose()

And yay! Our MCP client class is ready 😁

Here’s what the full code looks like:

from dotenv import load_dotenv
from contextlib import AsyncExitStack
from typing import Optional
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from gemini_tool_agent.agent import Agent
import os 

load_dotenv()

api_key = os.environ.get("GEMINI_KEY")

class MCP_CLIENT:
    def __init__(self):
        self.session: Optional[ClientSession] = None
        self.exit = AsyncExitStack()
        self.agent = Agent(api_key)

    async def connect_mcp_server(self, server_script_path):
        is_python = server_script_path.endswith('.py')
        is_js = server_script_path.endswith('.js')
        if not (is_python or is_js):
            raise ValueError("Server script must be a .py or .js file")

        cmd = "python" if is_python else "node"
        server = await self.exit.enter_async_context(
            stdio_client(
                StdioServerParameters(
                    command=cmd,
                    args=[server_script_path],
                    env=None,
                )
            )
        )
        self.stdio, self.write = server
        self.session = await self.exit.enter_async_context(ClientSession(self.stdio, self.write))
        await self.session.initialize()

        response = await self.session.list_tools()
        tools = [{
            "name": tool.name,
            "description": tool.description,
            "input_schema": tool.inputSchema
        } for tool in response.tools]

        self.agent.tools = tools
        print("\nConnected to server with tools:", [tool["name"] for tool in tools])

    async def get_response(self, input: str):
        try:
            response = self.agent.process_query(input)
            self.agent.history.append({"role": "user", "content": input})

            if isinstance(response, dict) and response.get("needs_tool", False):
                tool_name = response.get("tool_name", None)
                if tool_name:
                    tool_response = self.agent.process_use_tool(tool_name)
                    self.agent.history.append({"role": "assistant", "content": tool_response})

                    tool = tool_response["tool_name"]
                    call_tool = self.agent.process_use_tool(tool)
                    self.agent.history.append({"role": "process_tool_call", "content": call_tool})

                    result = await self.session.call_tool(tool, call_tool["input"])
                    self.agent.history.append({"role": "tool_call_result", "content": result})

            if isinstance(response, dict) and response.get("needs_direct_response", False):
                self.agent.history.append({"role": "direct_response", "content": response["direct_response"]})
                return response["direct_response"]
            else:
                conversation_context = self.agent.history[-5:] if len(self.agent.history) >= 5 else self.agent.history
                response_text = self.agent.generate_response(f"""
                You are a helpful assistant responding to the following query:
                QUERY: {input}

                CONVERSATION HISTORY: {conversation_context}

                Please provide a comprehensive and accurate response that considers the conversation history.
                """)
                self.agent.history.append({"role": "assistant", "content": response_text})
                return response_text

        except Exception as e:
            return f"An error occurred while processing your request: {str(e)}"


    async def chat_loop(self):
        """Main chat loop for interacting with the MCP server"""
        print("Chat session started. Type 'exit' to quit.")

        while True:
            try:
                user_input = input("\nYou: ").strip()

                if user_input.lower() == 'exit':
                    print("Ending chat session...")
                    break

                if not user_input:
                    continue

                try:
                    response = await self.get_response(user_input)
                except Exception as e:
                    print(f"\nError occurred: {str(e)}")
                    continue

                if response:
                    print("\nAssistant:", response)

            except Exception as e:
                print(f"\nError occurred: {str(e)}")
                continue

    async def close(self):
        await self.exit.aclose()

Next up, we use a main function to connect the client and start the chat loop between the user and our agent:

async def main():
    mcp_client = MCP_CLIENT()
    server_path = input("Enter the path to the server script: ")
    try:
        await mcp_client.connect_mcp_server(server_path)
        await mcp_client.chat_loop()
    finally:
        await mcp_client.close()

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

And voilà! We now have a fully functional implementation of our first MCP client. 🎉
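Of course, a client is only half the story; you need a server to point it at. If you don’t have one handy, here’s a minimal, hypothetical stdio server built with FastMCP (which ships inside the same mcp package). Save it as demo_server.py:

# demo_server.py - a tiny stdio MCP server for testing the client (illustrative)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

if __name__ == "__main__":
    # Serve over standard input/output, which is what our client expects.
    mcp.run(transport="stdio")

Run your client script with uv run (whatever you named the file), enter demo_server.py when prompted for the server path, and you should see the add tool show up in the "Connected to server with tools" message.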

Let me digress a little 😅 Building this was easily one of the most exciting things I’ve done in the last 7 days.

That said, it wasn’t without its hurdles. One key challenge I ran into was that most MCP client documentation and examples focus on Anthropic’s models, so there was no out-of-the-box path for plugging MCP tools into Gemini (yet).

This limitation inspired me to build gemini-tool-agent, a custom wrapper to bridge the gap and enable Gemini to work seamlessly with MCP-style tools. That decision made all the difference in bringing this project to life.

Stay tuned for more explorations.

Day 52 was a fun one. 🚀
