MCP Server: Standardizing LLM API Calls

TJ Gokken
3 min read

I recently published an article on how to build an MCP server in C# (https://tjgokken.com/building-your-first-model-context-protocol-server-in-c).

In that article, we built a simple MCP-style server. It remembered things and stored context in memory. There were no fancy embeddings, no vector databases. No orchestration layer with GitHub stars and a vague README file.

And yet it was still MCP.

MCP has a lot of hype right now. It’s all over dev social media, conference talks, blog posts, diagrams with arrows and embeddings flying everywhere. If you just landed here, you’d think MCP is the final evolution of software design.

Yet if you take a look at that article, you’ll see that it is really just an API app. MCP is not new tech; it’s a smart pattern and structured glue.

Where MCP really shines, though, is in standardizing how apps talk to LLMs — regardless of which model is doing the talking.

Here’s the usual situation when working with LLMs: every model has its own quirks. Different APIs, different input formats, different auth headers, different expectations on prompt formatting. So what happens? Every dev ends up writing their own little wrappers and helper classes just to make one call.

Now multiply that across multiple apps, multiple teams, and multiple models… and you’ve got a mess. To make things worse, different LLMs are usually good at different things: one might be good at coding, another at research, and yet another at creative writing. You get the idea.

What MCP says is: “You know what? Let’s wrap all that once — at the server level — and expose a clean, context-aware interface to the rest of the app.”

Your frontend or consuming service doesn’t need to know whether it’s talking to OpenAI, Claude, Mistral, or your secret fine-tuned Llama on localhost. The MCP server handles that.
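To make that concrete, here is a minimal C# sketch of the kind of abstraction an MCP server can hide behind. The names (ILlmProvider, ModelRouter, the provider classes) are made up for illustration; they are not part of any official MCP SDK, and the stubs just echo the prompt instead of making real API calls.

using System.Collections.Generic;
using System.Threading.Tasks;

// Each provider's quirks (payload format, auth headers, prompt conventions)
// live behind one interface. Illustrative names, not a real SDK.
public interface ILlmProvider
{
    Task<string> CompleteAsync(string systemPrompt, string userPrompt);
}

// Only this class would know OpenAI's chat completions format.
public class OpenAiProvider : ILlmProvider
{
    public Task<string> CompleteAsync(string systemPrompt, string userPrompt) =>
        Task.FromResult($"[openai] {userPrompt}"); // real API call goes here
}

// Only this class would know Anthropic's messages format.
public class ClaudeProvider : ILlmProvider
{
    public Task<string> CompleteAsync(string systemPrompt, string userPrompt) =>
        Task.FromResult($"[claude] {userPrompt}"); // real API call goes here
}

// The MCP server picks the model per task; the client never sees which one.
public class ModelRouter
{
    private readonly Dictionary<string, ILlmProvider> _providers = new()
    {
        ["coding"]   = new OpenAiProvider(),
        ["research"] = new ClaudeProvider()
    };

    public Task<string> AskAsync(string task, string prompt) =>
        _providers[task].CompleteAsync("You are a helpful assistant.", prompt);
}

Swap a provider, add a new one, or change which task maps to which model, and nothing on the client side has to change. Conceptually, the flow looks like the diagram below: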

flowchart TD
    A[Client App] -->|Unified MCP Request| B[MCP Server]

    B -->|LLM Prompt + Context| C1[OpenAI GPT-4]
    B -->|LLM Prompt + Context| C2[Anthropic Claude]
    B -->|LLM Prompt + Context| C3[Local Ollama Model]

    C1 -->|Response| B
    C2 -->|Response| B
    C3 -->|Response| B

    B -->|Simplified Response| A

    subgraph External LLMs
        C1
        C2
        C3
    end

So now instead of saying:

{
  "model": "gpt-4",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What’s the weather like in Sydney?" }
  ],
  "temperature": 0.7,
  ...
}

You just say:

{
  "type": "weather",
  "query": "Sydney",
  "sessionId": "abc123"
}

And that’s the power of MCP: your app doesn’t need to learn how every LLM works. Someone (the MCP server developer) already did, and baked that knowledge into the MCP server.
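Server-side, handling that simplified request could look something like the sketch below. Again, these are made-up names (McpRequest, WeatherTool) reusing the ILlmProvider interface from the earlier sketch, not the actual implementation from the linked article.

using System.Threading.Tasks;

// The client only sends type/query/sessionId; the server owns the prompt
// wording, the model choice, and any stored context for the session.
public record McpRequest(string Type, string Query, string SessionId);

public class WeatherTool
{
    private readonly ILlmProvider _llm;

    public WeatherTool(ILlmProvider llm) => _llm = llm;

    public Task<string> HandleAsync(McpRequest request)
    {
        // Turn the simple request into the full, provider-ready prompt here.
        var userPrompt = $"What's the weather like in {request.Query}?";
        return _llm.CompleteAsync("You are a helpful assistant.", userPrompt);
    }
}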

If that sounds smart, it’s because it is.

If that sounds like a good architecture, it’s because it is.

If it sounds like this could be the future… well, let’s not get ahead of ourselves. I’m sure that by this time next year, we’ll be talking about the next thing that’s going to change everything. And the year after that? Something else again.

You get the idea.

But for now? Let’s enjoy the moment — and build some MCP servers.

Written by

TJ Gokken

TJ Gokken is an Enterprise AI/ML Integration Engineer with a passion for bridging the gap between technology and practical application. Specializing in .NET frameworks and machine learning, TJ helps software teams operationalize AI to drive innovation and efficiency. With over two decades of experience in programming and technology integration, he is a trusted advisor and thought leader in the AI community