This guide is for folks who are comfortable using LLM-powered apps like ChatGPT, Claude Desktop, or Perplexity but aren’t sure what the whole MCP server fuss is about. Here, you’ll learn about MCP servers and see a real example of asking GitHub Copilot the same query with and without an MCP server running. The difference? 🌒 Night and ☀️ day.

You can also try it out yourself by setting up a minimal data stack using dlt and DuckDB, then leveraging the DuckDB MCP server with VS Code + GitHub Copilot for AI-powered SQL operations.

🏎 Want to skip straight to the guide? Here you go: https://github.com/foundinblank/mcp-data-sandbox

Introduction

What is an MCP server?

Model Context Protocol (MCP) servers are standardized interfaces (like REST APIs) that let large language models (LLMs) interact with external tools such as DuckDB, Figma, or Chrome.

But I feel like my LLM app already works with other tools? 🤔

Many LLM-powered apps use proprietary solutions to work with other tools. For example, GitHub Copilot in VS Code can access your files and monitor your terminal. The ChatGPT desktop app for macOS can read content from apps like Notes or Terminal.

MCP is different. It’s open, programmable, and designed to be a standard. It gives LLMs a consistent way to integrate with external systems—without vendor lock-in. Data can flow freely between the LLM and the external tool, and the LLM can ask the external tool to perform actions (database queries; whiteboard mockups; email inbox maintenance).
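To make "a consistent way to integrate" a little more concrete: MCP messages are JSON-RPC 2.0, and a client typically asks a server which tools it offers (`tools/list`) before asking it to run one (`tools/call`). Here's a minimal sketch of what those two requests look like. The `make_request` helper, the tool name `query`, and its `query` argument are all made up for illustration; a real server advertises its own tool names and input schemas.

```python
import json


def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request, the envelope MCP messages use."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it offers.
list_tools = make_request(1, "tools/list")

# Ask the server to run one tool. The tool name "query" and its
# "query" argument are hypothetical placeholders, not taken from
# any particular server.
call_tool = make_request(
    2,
    "tools/call",
    {"name": "query", "arguments": {"query": "SELECT 42 AS answer"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

Every MCP server speaks this same envelope, which is why one LLM app can plug into many different tools.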

So it’s like how USB makes it easier to plug things into other things?

Right. Think of it as a power-up 🍄 for your LLM. Without MCP, LLMs are boxed into their immediate environment. By using MCP servers, LLMs gain the ability to "agentically" access other tools, systems, or data sources.

It's similar to exposing a remote database through a REST API endpoint. Suddenly, your data becomes accessible to any app or user (with the right credentials), and the possibilities multiply.

Cool. Got an example for me to work with?

Yep! Use this guide: https://github.com/foundinblank/mcp-data-sandbox. Using the DuckDB MCP server, your LLM can power up 💪 with the ability to query DuckDB databases.

Querying Without And With An MCP Server

Once you’ve got your minimal data stack and DuckDB MCP server set up, you can ask your LLM (my example uses VS Code + GitHub Copilot with Claude 3.7 Sonnet) to do something, anything, with the data in your DuckDB file. Below are examples of how Copilot handles the same question without and with the DuckDB MCP server running: “How many schemas and tables do I have in my DuckDB file?”

Without DuckDB MCP Server 👎

It was a simple question, but the LLM was totally lost. And there were plenty of errors in the terminal too.

It was just stumbling around trying different things. As amazing as Claude 3.7 Sonnet is, it didn’t seem to really understand that there was an external tool out there containing the data I was interested in and that there was a correct way to interface with it.

With DuckDB MCP Server 👍

What a power-up! Copilot is much more sophisticated already. Using the MCP server, it was able to write queries, run them directly against the DuckDB engine, pull out the results, and summarize them for me.
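If you're curious what "run them directly against the engine and pull out the results" might look like on the server side, here's a toy sketch. It is not how the DuckDB MCP server is actually implemented: it skips the transport entirely, uses Python's built-in `sqlite3` as a stand-in for DuckDB so it runs without extra installs, and the `handle_tool_call` function, the `query` tool name, and the two tables are all hypothetical.

```python
import json
import sqlite3


def handle_tool_call(conn, request):
    """Run the SQL from a tools/call-shaped request and wrap the rows in a result."""
    sql = request["params"]["arguments"]["query"]
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    # MCP tool results carry a list of content items; here, one text item.
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": json.dumps(rows)}]},
    }


# Stand-in database: sqlite3 instead of DuckDB, with made-up tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (id INTEGER, time_class TEXT)")
conn.execute("CREATE TABLE players (username TEXT)")

# The kind of request an LLM client could send after writing a query.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query",
        "arguments": {
            "query": "SELECT count(*) AS table_count "
                     "FROM sqlite_master WHERE type = 'table'"
        },
    },
}

response = handle_tool_call(conn, request)
print(response["result"]["content"][0]["text"])
```

The LLM writes the SQL, the server runs it and hands back the rows as text, and the LLM summarizes that text for you.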

I asked it next to “tell me some interesting information about the data in my DuckDB file” and I got this:

# Chess Database Insights
The database appears to contain primarily high-level chess games 
played by grandmaster-level players, with World Champion Magnus 
Carlsen being the most prominent player in the dataset.

## Overview
- **Total Games**: 1,179 chess games from Chess.com
- **Distinct Players**: 234 unique players

## Game Distribution
- **Blitz games**: 817 games (69%)
- **Bullet games**: 362 games (31%)

## Player Ratings
- **Average Rating**: ~3,062 for both white and black players
- **Rating Distribution**:
  | Rating Range | Games |
  |-------------|-------|
  | 3000+       | 759   |
  | 2500-2999   | 414   |
  | 2000-2499   | 3     |
  | 1500-1999   | 2     |
  | Under 1000  | 1     |

## Game Outcomes
| Outcome          | Count |
|------------------|-------|
| Win              | 574   |
| Resigned         | 316   |
| Timeout          | 110   |
| Repetition       | 76    |
| Insufficient     | 40    |
| Checkmated       | 32    |
| Other            | 31    |

## Player Accuracy
- **White pieces**: 85.7% average accuracy
- **Black pieces**: 84.9% average accuracy

## Notable Players
| Player           | Title | Followers |
|------------------|-------|-----------|
| Magnus Carlsen   | GM    | 262,141   |
| R Praggnanandhaa | GM    | 11,187    |
| Vincent Keymer   | GM    | 3,217     |

It was like the LLM knew exactly what to do! With the MCP server up and running, the LLM was aware of what DuckDB is and how to interface with DuckDB files. The MCP server comes with its own instructions, which the LLM clearly leveraged.

Try it yourself — here’s the guide: https://github.com/foundinblank/mcp-data-sandbox

Conclusion

In short, MCP servers give your LLMs cool power-ups ⤴️ to do whatever they want in other tools. Here are some links to explore more MCP servers:

MCP Servers for Dummies