MCP Server Deployment & Hosting: A Practical Guide through Docker & Kubernetes


You've built your MCP server and want to deploy it remotely or publish it as a package? This article covers your options from a DevOps viewpoint.
If you don't know what MCP servers are, see my previous articles on the topic.
In short, the Model Context Protocol (MCP) is an open standard developed by Anthropic, the creators of Claude LLM. It allows AI agents to use external tools to:
Get additional context, such as from an external API
Not only generate text but also interact with the external world, making actions possible
MCP connects an AI model to N external services.
You can also connect M models to N external services, which is for example the case when using GPT-4 for text generation and DALL·E or Stable Diffusion for image generation.
In that case, MCP solves this M×N integration problem by reducing the complexity to M+N: each of the M models and each of the N tools implements the protocol once, instead of one bespoke integration per model/tool pair. With 3 models and 4 tools, that means 7 MCP connectors instead of 12 custom integrations.
That's amazing, isn't it? Well, that depends on whether you successfully host your MCP servers or they remain in the unused shadows.
We’ll go through the following deployment approaches:
Local hosting: Clone MCP servers on your machine and reference their entrypoint in your IDE’s MCP configuration file.
Remote hosting: Deploy your MCP server to cloud platforms as serverless or long-lived apps.
We'll go through serverless limitations for that use case.
This makes your server accessible from anywhere with a public URL → perfect for team collaboration.
Containerization & Kubernetes: For scalability purposes, you can host your MCP server remotely on a Kubernetes cluster. We'll see how to package your MCP server as a Docker container and deploy it.
Packaging as libraries: Publish your MCP server as a Python or npm package.
This is the easiest method if you want a privacy-first approach while still making it easy for others to install and use your tools.
Deployment Options
1. Local Hosting
You can reference your Python/JS MCP server entrypoint in your IDE's MCP configuration file.
Here is an example Claude Desktop configuration file:
// claude_desktop_config.json
{
  "mcpServers": {
    // For a Python MCP server, run with 'uv' or 'poetry'
    "MyPythonMCP": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/your/server/directory",
        "run",
        "server.py"
      ]
    },
    // For a JavaScript MCP server, run with 'node'
    "MyJavaScriptMCP": {
      "command": "node",
      "args": [
        "/path/to/your/server/index.js"
      ]
    }
  }
}
I recommend containerizing your MCP server so that others can run it securely with limited access to the filesystem, CPU, and memory. People should not run your Python/JS MCP server without caution; add safeguards by using Docker. We'll see how to do that in the steps below.
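For example, once the image exists (we build it in section 3), your Claude Desktop entry can run the containerized server with resource limits. The image name and limit values below are illustrative:
// claude_desktop_config.json
{
  "mcpServers": {
    "MyDockerizedMCP": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "--memory=512m", "--cpus=1", "my-mcp-server"]
    }
  }
}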
2. Remote Hosting
A remote MCP server can be deployed like any other web service. Two transport protocols are commonly used: Server-Sent Events (SSE) and Streamable HTTP. SSE is deprecated and should be replaced by Streamable HTTP.
To set up SSE, two endpoints are served:
/sse: The client initializes the connection by making a GET request to it. The server keeps the connection open, sending responses using SSE.
/messages: The client sends messages through POST requests. The server processes the messages and replies through the SSE connection.
This means the connection is kept open (long-lived connections) and cannot be used in ephemeral environments.
Streamable HTTP, however, uses short-lived sessions only. It's served through a single /mcp endpoint, which is perfect for deploying to serverless environments.
2.a. How to use Streamable HTTP?
For FastMCP, use:
mcp.run(transport="http", host="127.0.0.1", port=8000)
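For context, here is a minimal FastMCP server sketch; the server name and the add tool are illustrative:
# server.py - minimal FastMCP server (illustrative names)
from fastmcp import FastMCP

mcp = FastMCP("MyPythonMCP")

@mcp.tool
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

if __name__ == "__main__":
    # Streamable HTTP: everything is served through a single /mcp endpoint
    mcp.run(transport="http", host="127.0.0.1", port=8000)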
For Cloudflare Workers templates, specify the /mcp API handler:
// OAuthProvider is exported by Cloudflare's workers-oauth-provider package
import OAuthProvider from "@cloudflare/workers-oauth-provider";
import GitHubHandler from "./github-handler";

export default new OAuthProvider({
  apiHandlers: {
    // Deprecated SSE protocol - use /mcp instead
    // "/sse": MyMCP.serveSSE("/sse"),
    "/mcp": MyMCP.serve("/mcp") // Streamable HTTP protocol
  },
  defaultHandler: GitHubHandler,
  authorizeEndpoint: "/authorize",
  tokenEndpoint: "/token",
  clientRegistrationEndpoint: "/register"
});
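From there, local testing and deployment go through the standard Wrangler CLI, assuming your project already has a Wrangler configuration:
# Test locally, then publish the Worker
npx wrangler dev
npx wrangler deploy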
2.b. Common Remote Hosting Options
You can deploy your MCP using various platforms:
Render: Simple deployment with GitHub integration
Vercel: Great for JavaScript/TypeScript MCPs
Netlify: Excellent for frontend-focused MCPs
Cloudflare Workers: Good for JavaScript MCPs without Node dependencies. Python MCPs are also supported, but not on the Free plan due to file size limits.
When using these cloud platforms, you host your Python or JavaScript app directly, not the containerized version. The platforms already manage resource limits for you.
Before deploying your server, make sure to add logging that records user activity with unique correlation IDs. Tracing will be your best friend when debugging.
2.c. Add Tracing to Python MCP servers
You can use mcp-trace from ContexaAI with FastMCP.
Log to any of the supported backends: console, file, Contexa, PostgreSQL, Supabase, or several of them at the same time:
import os

# Adapters and middleware from the mcp-trace package
# (exact import paths are an assumption; check the package docs for your version)
from mcp_trace import (
    ConsoleTraceAdapter,
    ContexaTraceAdapter,
    FileTraceAdapter,
    PostgresTraceAdapter,
    SupabasePostgresTraceAdapter,
    TraceMiddleware,
)

# Fan out each trace to several backends at once
class MultiAdapter:
    def __init__(self, *adapters):
        self.adapters = adapters

    def export(self, trace_data: dict):
        for adapter in self.adapters:
            adapter.export(trace_data)

contexa_adapter = ContexaTraceAdapter(
    api_key=os.getenv("CONTEXA_API_KEY"),
    server_id=os.getenv("CONTEXA_SERVER_ID"),
)
file_adapter = FileTraceAdapter("trace.log")
psql_adapter = PostgresTraceAdapter(
    dsn=os.getenv("POSTGRES_DSN")  # example: postgresql://user:pass@host:port/dbname
)
supabase_adapter = SupabasePostgresTraceAdapter(os.getenv("SUPABASE_URL"))
console_adapter = ConsoleTraceAdapter()

trace_middleware = TraceMiddleware(
    adapter=MultiAdapter(
        contexa_adapter,
        file_adapter,
        psql_adapter,
        supabase_adapter,
        console_adapter,
    )
)
mcp.add_middleware(trace_middleware)
If you use the PostgreSQL adapter, create a table to store traces:
CREATE TABLE mcp_traces (
id SERIAL PRIMARY KEY,
timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
session_id TEXT NOT NULL,
trace_data JSONB NOT NULL
);
I strongly advise configuring tracing to exclude sensitive data, to ensure privacy and avoid leaking users' data and credentials:
trace_middleware = TraceMiddleware(
    adapter=trace_adapter,
    log_fields={
        "entity_name": True,
        "entity_response": True,
        "entity_params": False,  # disables tool arguments
        "client_id": False,      # disables client_id
        # ...add more as needed
    }
)
mcp.add_middleware(trace_middleware)
2.d. Add Tracing to TypeScript MCP Servers
You can use mcp-trace-js from ContexaAI.
Log to any of the supported backends: console, file, Contexa, PostgreSQL, Supabase, or several of them at the same time:
import {
  ContexaTraceAdapter,
  FileAdapter,
  PostgresTraceAdapter,
  SupabaseTraceAdapter,
  MultiAdapter,
  TraceMiddleware,
} from "mcp-trace-js";
import express from "express";

const app = express(); // the Express app serving your MCP endpoint

let traceMiddleware;
if (process.env.NODE_ENV === "production") {
  const contexaAdapter = new ContexaTraceAdapter({
    apiKey: process.env.CONTEXA_API_KEY,
    serverId: process.env.CONTEXA_SERVER_ID
    // Optional: apiUrl, bufferSize, flushInterval, maxRetries, retryDelay
  });
  const fileAdapter = new FileAdapter(
    process.env.TRACE_LOG_FILE || "trace.log"
  );
  const psqlAdapter = new PostgresTraceAdapter({
    dsn: process.env.POSTGRES_DSN // e.g. postgresql://user:pass@host:port/dbname
  });
  const supabaseAdapter = new SupabaseTraceAdapter({
    supabaseClient: process.env.SUPABASE_URL // or a client object if you init it elsewhere
  });
  const multiAdapter = new MultiAdapter(
    contexaAdapter,
    fileAdapter,
    psqlAdapter,
    supabaseAdapter
  );
  traceMiddleware = new TraceMiddleware({ adapter: multiAdapter });
  app.use("/mcp", traceMiddleware.express());
}
Here too, if you use the PostgreSQL adapter, create a table to store traces:
CREATE TABLE IF NOT EXISTS trace_events (
id SERIAL PRIMARY KEY,
timestamp TIMESTAMPTZ NOT NULL,
type TEXT NOT NULL,
method TEXT,
session_id TEXT NOT NULL,
client_id TEXT,
duration INTEGER,
entity_name TEXT,
arguments JSONB,
response TEXT,
error TEXT
);
You should also exclude sensitive data here:
const traceMiddleware = new TraceMiddleware({
  adapter: traceAdapter,
  logFields: {
    tool_name: true,
    tool_response: true,
    tool_arguments: false, // disables tool arguments
    client_id: false,      // disables client_id
    // ...add more as needed
  },
});
You can skip tracing for specific requests by adding the X-Ignore-Traces header:
# Skip tracing for this request
curl -H "X-Ignore-Traces: true" http://localhost:8080/mcp
2.e. Add tracing to Cloudflare Workers MCP Servers
When using Cloudflare Workers, you may pass your MCP handlers to an OAuthProvider. To add tracing, wrap these handlers:
import { ConsoleAdapter, TraceMiddleware } from "mcp-trace-js";

const trace = new TraceMiddleware({ adapter: new ConsoleAdapter() });

// Instead of the Express middleware, use the low-level span functions.
const tracedServe = (path: string) => {
  const handler = MyMCP.serve(path);
  return {
    async fetch(req: Request, env: Env, ctx: ExecutionContext) {
      const span = trace.startSpan({ path });
      try {
        const res = await handler.fetch(req, env, ctx);
        trace.endSpan(span, { status: res.status });
        return res;
      } catch (err) {
        trace.endSpan(span, { error: err });
        throw err;
      }
    }
  };
};

export default new OAuthProvider({
  apiHandlers: {
    "/mcp": tracedServe("/mcp"),
  },
  ...
});
2.f. Example: Deploying to Render
Create a new Web Service on Render using GitHub integration by selecting your code repository.
Enter service details:
Name:
My-Custom-MCP
Build Command:
npm install && npm run build
Start Command:
npm start
You'll get a public URL like: https://my-custom-mcp-j3jc.onrender.com
Configure your IDE to use the remote MCP:
// claude_desktop_config.json
{
  "mcpServers": {
    "my_custom_mcp": {
      "command": "npx",
      "args": ["mcp-remote", "https://my-custom-mcp-j3jc.onrender.com"]
    }
  }
}
3. Python MCP Server Containerization
If you self-host your remote MCP, containerize it to:
limit its filesystem access
limit its CPU and memory usage
You would also want to containerize your MCP server for scalability.
The basic idea of a Dockerfile is that it's a list of operations:
Start from a base image:
FROM python:3.11-slim
You start from an initial list of operations, called a base image. Here we use the slim Python base image python:3.11-slim for its light size.
Create a non-root user named mcpuser:
RUN groupadd -g 1000 mcpuser && \
    useradd -u 1000 -g mcpuser -s /bin/bash -m mcpuser
To prevent the container from accessing all files, we run it as a non-root user. The created user is identified by its UID & GID, here 1000 and 1000 respectively.
Install Python dependencies: copy the requirements.txt file, then install dependencies:
RUN pip install --no-cache-dir -r requirements.txt
Copy your app's source code: copy the rest of the files and let the created user modify them:
COPY --chown=mcpuser:mcpuser . .
Run your MCP server entrypoint:
CMD ["python", "server.py"]
Leverage caching by using a multi-stage Dockerfile: the first stage installs dependencies in /install, then the second stage copies the installed dependencies.
→ Until requirements.txt changes, the first stage will not run again.
Let's put all that together into one Dockerfile file, next to your server.py file:
# Start with a Python base image
FROM python:3.11-slim AS builder
# Set working directory
WORKDIR /app
# Install build tools
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# Create a non-root user with UID & GID 1000
RUN groupadd -g 1000 mcpuser && \
useradd -u 1000 -g mcpuser -s /bin/bash -m mcpuser
# Copy requirements first for better caching
COPY requirements.txt .
# Install dependencies into /install
RUN python -m venv /install && \
/install/bin/pip install --no-cache-dir -r requirements.txt
# Final image
FROM python:3.11-slim
WORKDIR /app
# Create a non-root user
RUN groupadd -g 1000 mcpuser && \
useradd -u 1000 -g mcpuser -s /bin/bash -m mcpuser
# Copy installed dependencies from builder
COPY --from=builder /install /install
ENV PATH="/install/bin:$PATH"
# Copy application code with ownership
COPY --chown=mcpuser:mcpuser . .
# If you have a config file, uncomment the below line
# Mark mounted files as read-only
# VOLUME ["/app/config"]
# Environment variables
ENV PORT=8000
ENV LOG_LEVEL=INFO
# Switch to non-root user
USER mcpuser
# Expose port
EXPOSE 8000
# Run the server
CMD ["python", "server.py"]
- Let's build and run your containerized MCP server:
docker build -t my-mcp-server . && docker run -it my-mcp-server
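Since limiting resources is the point of containerizing, here is a run command with memory, CPU, and filesystem restrictions; the limit values are illustrative:
# Drop --read-only if your server writes files, e.g. a trace.log
docker run -it --rm \
  --memory=512m \
  --cpus=1 \
  --read-only --tmpfs /tmp \
  -p 8000:8000 \
  my-mcp-server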
- You can use MCP Inspector to debug your Docker container:
npx @modelcontextprotocol/inspector docker run -i --rm my-mcp-server
- Publish your docker image to an image registry (here Docker Hub)
# Tag and Publish the Docker image to Docker Hub
# Replace 'your_dockerhub_username' and 'your_repository_name' accordingly.
docker tag my-mcp-server your_dockerhub_username/your_repository_name:latest
# Log in to Docker Hub (you'll be prompted for your username and password)
docker login
# Push the Docker image to Docker Hub
docker push your_dockerhub_username/your_repository_name:latest
Add health endpoints & self-healing to your container
Docker marks a container as running as long as its main process is active. But an active process does not guarantee the application is healthy. A service might be:
Unresponsive because of an internal error
Waiting for a dependency such as a database
Still starting up and not yet ready to handle traffic
Let's solve this with docker-compose, which has a built-in health check mechanism.
If you're not familiar with docker-compose, it lets you define and manage multi-container Docker applications using a single YAML file.
Create a docker-compose.yml file to run your MCP server:
services:
  mcp-server:
    image: <your_docker_hub_username>/<repo>:latest
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    environment:
      - LOG_LEVEL=INFO
    healthcheck:
      # note: slim Python images don't ship curl; install it in your image for this check to work
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 5s
    restart: unless-stopped
  autoheal:
    restart: always
    image: willfarrell/autoheal
    environment:
      - AUTOHEAL_CONTAINER_LABEL=all
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
The autoheal container, based on the willfarrell/autoheal image, ensures unhealthy containers get restarted.
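The healthcheck above curls a /health route, which your server must actually expose. Here is a minimal sketch for a Python FastMCP server, assuming your FastMCP version provides the custom_route decorator:
from fastmcp import FastMCP
from starlette.requests import Request
from starlette.responses import PlainTextResponse

mcp = FastMCP("MyPythonMCP")

# Plain HTTP route served alongside the /mcp endpoint, used by the healthcheck
@mcp.custom_route("/health", methods=["GET"])
async def health(request: Request) -> PlainTextResponse:
    return PlainTextResponse("OK")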
Use the following commands to run/build/push:
# Run the image:
docker-compose up
# Run the image in detached mode:
docker-compose up -d
# Force to rebuild the image:
docker-compose up --build
# Build-only:
docker-compose build
# Push to Docker hub
docker-compose push
4. TypeScript MCP Server Containerization
The steps are essentially the same, except for:
The base image: use a node image from Docker Hub. Check your current Node version using node --version and pick the matching tag on Docker Hub.
The dependencies installation: we use npm ci to install them from the package files.
The default port: FastMCP listens on port 8000, while the @modelcontextprotocol TypeScript SDK listens on port 3000 by default.
# Start with node base image
FROM node:22.12-alpine AS builder
WORKDIR /app

# Create non-root user
RUN addgroup -g 1000 mcpuser && \
    adduser -D -u 1000 -G mcpuser mcpuser

# Copy package files and leverage npm caching
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# Copy application code
COPY --chown=mcpuser:mcpuser . .

# Build the app
RUN npm run build

# === Stage 2: Release image ===
FROM node:22-alpine AS release
WORKDIR /app

# Copy built files and package info from builder
COPY --from=builder /app/dist /app/dist
COPY --from=builder /app/package*.json /app/

# Set environment
ENV NODE_ENV=production

# Install only production dependencies
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev --ignore-scripts

# Create non-root user
RUN addgroup -g 1000 mcpuser && \
    adduser -D -u 1000 -G mcpuser mcpuser
USER mcpuser

# Mark mounted files as read-only
VOLUME ["/app/config"]

# Expose port
EXPOSE 3000

# Run the server
ENTRYPOINT ["node", "dist/index.js"]
You can also use a docker-compose file for auto-heal:
services:
  mcp-server:
    image: <your_docker_hub_username>/<repo>:latest
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:3000"
    environment:
      - LOG_LEVEL=INFO
    healthcheck:
      # note: alpine images don't ship curl; install it or use busybox wget for this check
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 5s
    restart: unless-stopped
  autoheal:
    restart: always
    image: willfarrell/autoheal
    environment:
      - AUTOHEAL_CONTAINER_LABEL=all
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
5. Kubernetes Deployment
To ensure scalability, you may want to run your containers on a Kubernetes cluster.
We’ll create the following Kubernetes manifests:
A deployment, which describes your application's lifecycle: how many pods run, which image they use, and so on. A pod is a group of one or more containers.
A service, which exposes a set of pods behind a stable network endpoint. Even if the pods get replaced, the service endpoint remains.
An ingress, in order to route traffic from outside the cluster to the internal service.
In the manifests below, replace <MCP Server PORT> with 3000 for TypeScript MCP servers, and with 8000 for Python FastMCP servers.
Create a Kubernetes deployment manifest
Create a deployment.yaml file which creates 2 replicas of your mcp-server Docker image. Note the pods label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
  namespace: mcp-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      containers:
        - name: mcp-server
          image: your-registry/mcp-server:latest
          ports:
            - containerPort: <MCP Server PORT>
          envFrom:
            - secretRef:
                name: mcp-server-secrets
          volumeMounts:
            - name: config-volume
              mountPath: /app/config
              readOnly: true
      volumes:
        - name: config-volume
          configMap:
            name: mcp-server-config
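Since scalability is the reason for running on Kubernetes, you can pair this deployment with a HorizontalPodAutoscaler. A minimal sketch, where the thresholds are illustrative and CPU-based scaling assumes you set resource requests on the container:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server-hpa
  namespace: mcp-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70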
Monitor your pods
If you use the Prometheus operator, you can monitor your pods using a PodMonitor or a ServiceMonitor.
Implement a custom /metrics route, then create a PodMonitor manifest in pod-monitor.yaml. Note that the metrics port name must match a named containerPort in your deployment:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: mcp-server-pod-monitor
  namespace: mcp-server
spec:
  selector:
    matchLabels:
      app: mcp-server
  podMetricsEndpoints:
    - port: metrics
      path: /metrics
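Here is a sketch of that /metrics route for a Python FastMCP server, assuming the prometheus_client package and FastMCP's custom_route decorator; the counter is illustrative:
from fastmcp import FastMCP
from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest
from starlette.requests import Request
from starlette.responses import Response

mcp = FastMCP("MyPythonMCP")

# Example metric; increment it from your tools as needed
TOOL_CALLS = Counter("mcp_tool_calls_total", "Total MCP tool invocations")

# Prometheus scrapes this route through the PodMonitor above
@mcp.custom_route("/metrics", methods=["GET"])
async def metrics(request: Request) -> Response:
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)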
Expose your MCP server with a Kubernetes service
Create a service.yaml file. Define a ClusterIP service that targets our pods label, i.e. mcp-server:
apiVersion: v1
kind: Service
metadata:
  name: mcp-server
  namespace: mcp-server
spec:
  selector:
    app: mcp-server
  ports:
    - port: 80
      targetPort: <MCP Server PORT>
  type: ClusterIP
Make your service accessible from outside
Create an ingress.yaml file. Define an ingress that uses your Kubernetes cluster's ingressClassName. If you're using the nginx ingress, it will look like this (note the backend targets the service's port 80, not the container port):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-server-ingress
  namespace: mcp-server
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    cert-manager.io/cluster-issuer: "{{ cluster_issuer }}"
spec:
  ingressClassName: nginx
  rules:
    - host: "mcp-server.{{ main_domain }}"
      http:
        paths:
          - backend:
              service:
                name: mcp-server
                port:
                  number: 80
            path: /
            pathType: Prefix
I strongly recommend using an OAuth2 proxy setup to add an auth wall and prevent your MCP server from being spammed, unless you've already implemented OAuth2 in your MCP server.
→ If you already have an auth & authorization server, just change your ingress annotations like so:
nginx.ingress.kubernetes.io/auth-method: 'GET'
nginx.ingress.kubernetes.io/auth-url: "{{ auth_url }}"
nginx.ingress.kubernetes.io/auth-signin: "{{ auth_signin }}"
nginx.ingress.kubernetes.io/auth-response-headers: 'Remote-User,Remote-Name,Remote-Groups,Remote-Email'
Reference your identity provider service (for example Authelia) in those annotation values.
Managing API Keys Securely
If credentials are needed, store them using Kubernetes secrets in secret.yml:
apiVersion: v1
kind: Secret
metadata:
  name: mcp-server-secrets
type: Opaque
data:
  API_KEY: <base64-encoded-api-key>
  DATABASE_URL: <base64-encoded-db-url>
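The values under data: must be base64-encoded. You can encode them yourself, or let kubectl create the same Secret directly; the example values are placeholders:
# Encode a value by hand (paste the output under data:)
echo -n 'my-api-key' | base64
# Or create the Secret without writing YAML
kubectl create secret generic mcp-server-secrets \
  --from-literal=API_KEY='my-api-key' \
  --from-literal=DATABASE_URL='postgresql://user:pass@host:5432/db' \
  -n mcp-server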
Apply manifests
To create your Kubernetes resources:
Make sure you can access your Kubernetes cluster using kubectl.
All your resources will be in an isolated namespace, which you can create like this:
kubectl create namespace mcp-server
Apply all manifest files:
kubectl apply -f ./ -n mcp-server
# or apply them one by one, e.g.:
# kubectl apply -f deployment.yaml -n mcp-server
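You can then check that everything came up with the standard kubectl commands:
kubectl get pods -n mcp-server
kubectl rollout status deployment/mcp-server -n mcp-server
kubectl get ingress -n mcp-server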
Your MCP server is now online at the mcp-server.{{ main_domain }} endpoint. Configure your IDE to use it:
// claude_desktop_config.json
{
  "mcpServers": {
    "my_custom_mcp": {
      "command": "npx",
      "args": ["mcp-remote", "https://mcp-server.<YOUR_DOMAIN>"]
    }
  }
}
Your team can now connect to your MCP Server remotely!
6. Packaging as Libraries
Python Package
- Create an __init__.py module file:
# src/my_custom_mcp/__init__.py
from .server import mcp
Define your package metadata in pyproject.toml (a minimal example follows the commands below)
Build and publish:
pip install build
python -m build
pip install twine
twine upload dist/*
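Here is a minimal pyproject.toml sketch to go with the steps above; the package name, version, and build backend are assumptions to adapt:
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "my-custom-mcp"          # hypothetical package name
version = "0.1.0"
description = "My custom MCP server"
requires-python = ">=3.11"
dependencies = ["fastmcp"]

[project.scripts]
# hypothetical entrypoint: expects a main() function in src/my_custom_mcp/server.py
my-custom-mcp = "my_custom_mcp.server:main"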
npm Package
Your entrypoint is defined in index.js (or dist/index.js if using TypeScript).
Define package metadata in package.json (a minimal example follows the commands below)
Build & publish:
npm run build
npm login
npm publish
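A minimal package.json sketch; the bin entry is what lets users run your published server through npx, and the names are placeholders:
{
  "name": "my-custom-mcp",
  "version": "0.1.0",
  "type": "module",
  "bin": {
    "my-custom-mcp": "dist/index.js"
  },
  "scripts": {
    "build": "tsc"
  },
  "files": ["dist"]
}
For the bin entry to work, dist/index.js must start with a #!/usr/bin/env node shebang so npx can execute it.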
Conclusion & Recap
You now have the foundation to deploy MCP servers that are secure, scalable, and production-ready. Here’s a quick recap:
Deployment options: Local development, containerized services, and Kubernetes orchestration
Security best practices: Protect data and systems on local and remote servers
Containerization: Isolate and manage MCP server resources effectively
Monitoring & tracing: Health endpoints, observability tools, and audit trails
AI-agent readiness: Design MCP tools that can be leveraged by intelligent agents
Next steps in the MCP ecosystem:
Optimize performance for high-throughput servers
Explore multi-region and multi-tenant deployments
Implement advanced security measures and access controls
Set up MCP proxies and gateways for centralized management and team collaboration
If you've deployed MCP server infrastructure, how did you manage it, and were your MCP servers effectively adopted by your teams?
Start deploying, experimenting, and sharing your results today — every improvement helps shape the future of connected AI systems.