MCP Server Monitoring Via Prometheus & Grafana

Table of contents
- Why Monitor MCP Servers?
- Architecture Overview
- How is Monitoring MCP Servers Different from Traditional API Servers?
- Implementation Deep Dive
- Docker Compose Configuration
- Grafana Dashboard Configuration
- Benefits of This Approach
- Best Practices and Lessons Learned
- Future Enhancements
- Conclusion
- Getting Started

Why Monitor MCP Servers?
MCP servers often handle critical AI workloads, and understanding their behavior is essential for:
- Performance Optimization: Identifying bottlenecks and resource constraints
- Reliability Assurance: Detecting failures and anomalies before they impact users
- Capacity Planning: Understanding usage patterns to scale appropriately
- Debugging: Gaining insights into tool execution patterns and failures
Architecture Overview
Our monitoring solution consists of three main components:
- MCP Server with Custom Metrics: A FastAPI-based server that exposes MCP tools with integrated Prometheus metrics
- Prometheus: Time-series database for metrics collection and storage
- Grafana: Visualization platform for creating dashboards and alerts
How is Monitoring MCP Servers Different from Traditional API Servers?
A key difference is that MCP servers typically expose only a single gateway endpoint (e.g., /mcp). All tool invocations are routed through this endpoint, making it impossible to distinguish which specific tool was called using standard API monitoring techniques. In contrast, traditional API servers have separate endpoints for each operation, allowing easy tracking of request types and usage patterns.
To monitor individual tool usage in MCP servers, you must implement custom middleware or decorators for each tool. These components can:
- Intercept tool calls
- Record metrics for each tool invocation
- Attach tool-specific labels to Prometheus metrics
Without these customizations, you only see aggregate traffic to the /mcp endpoint, losing visibility into the performance and reliability of individual tools. This makes MCP monitoring more challenging and requires a tailored approach for meaningful observability.
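To make this concrete, here is a minimal sketch of what such a decorator could look like. The metric names, labels, and wiring below are illustrative assumptions rather than the exact code used in this project:

```python
import time
from functools import wraps

from prometheus_client import Counter, Histogram

# Shared metrics with a "tool" label, so each tool is visible separately
# even though every call arrives at the same /mcp endpoint.
# (Hypothetical metric names, chosen for this example.)
TOOL_CALLS = Counter(
    'mcp_tool_calls_total',
    'Total MCP tool invocations',
    ['tool', 'status']
)
TOOL_DURATION = Histogram(
    'mcp_tool_duration_seconds',
    'MCP tool execution time in seconds',
    ['tool']
)

def monitored_tool(func):
    """Count and time every invocation of the wrapped MCP tool."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            TOOL_CALLS.labels(tool=func.__name__, status='success').inc()
            return result
        except Exception:
            TOOL_CALLS.labels(tool=func.__name__, status='error').inc()
            raise
        finally:
            TOOL_DURATION.labels(tool=func.__name__).observe(time.perf_counter() - start)
    return wrapper
```

A tool function would then carry both the MCP registration decorator and this monitoring wrapper, so the recorded metrics reflect the actual tool body.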
Implementation Deep Dive
1. Setting Up the MCP Server
The foundation of our monitoring solution is a FastAPI server that hosts MCP tools. Here's how we've structured it:
from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator
from prometheus_client import Counter, Gauge, Histogram, Summary

app = FastAPI(title="Demo Prometheus for Cat", version="0.0.1")

# Initialize the Prometheus instrumentator and expose the /metrics endpoint
instrumentor = Instrumentator()
instrumentor.instrument(app).expose(app)
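With the instrumentator attached, the server exposes a /metrics endpoint (the library's default path) that Prometheus can scrape. As a quick sanity check, something like the following can be run against a locally running instance; the host and port are assumptions based on the Docker Compose setup shown later:

```python
# Fetch the Prometheus exposition output from the running FastAPI app
# and print the HTTP metrics added by the instrumentator.
from urllib.request import urlopen

with urlopen("http://localhost:8000/metrics") as resp:
    body = resp.read().decode()

for line in body.splitlines():
    if line.startswith("http_"):
        print(line)
```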
2. Custom Metrics for MCP Tools
One of the most innovative aspects of this implementation is the custom metrics system for MCP tools. Let's examine a practical example with our "evil cat" tools:
import random

@mcp.tool(name="sharpen_claw", description="For sharpening evil cat's claw to a new level")
def sharp_claw(level: int) -> str:
    # Access tool-specific metrics directly
    claw_level_metric = metrics_registry.get_tool_metric('mcp_claw_level_total')
    if claw_level_metric:
        claw_level_metric.observe(level)
    return f"claw sharpened to level: {level}"

@mcp.tool(name="scratch_sofa", description="For testing evil cat's claw to ruin leather")
def start_scratch_sofa(claw_level: int) -> str:
    sofa_level = 1000000000
    sofa_hardness = random.randint(1, 10000)
    if sofa_hardness > claw_level:
        failure_metric = metrics_registry.get_tool_metric('mcp_sofa_scratch_failure_total')
        if failure_metric:
            failure_metric.inc()
        return "your claws are not sharp enough"
    success_metric = metrics_registry.get_tool_metric('mcp_sofa_scratch_success_total')
    if success_metric:
        success_metric.inc()
    return "sofa is completely ruined. Evil Cat now Laughs"
3. Metrics Registry Pattern
The metrics registry pattern is crucial for managing tool-specific metrics. This approach allows:
- Centralized Metric Management: All metrics are registered and managed from a single location
- Dynamic Metric Access: Tools can access their specific metrics without tight coupling
- Scalability: Easy to add new metrics for new tools without modifying existing code
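The exact registry used in the repository isn't reproduced here, but a minimal sketch of the idea, assuming a simple dict-backed registry keyed by metric name, could look like this (the Histogram/Counter choices mirror how the tools above call .observe() and .inc()):

```python
from prometheus_client import Counter, Histogram

class MetricsRegistry:
    """Central place where tool-specific metrics are created and looked up."""

    def __init__(self):
        self._metrics = {}

    def register_tool_metric(self, name, metric):
        self._metrics[name] = metric

    def get_tool_metric(self, name):
        # Returning None lets a tool keep working even if its metric
        # was never registered.
        return self._metrics.get(name)

# Register the metrics used by the evil cat tools above
metrics_registry = MetricsRegistry()
metrics_registry.register_tool_metric(
    'mcp_claw_level_total',
    Histogram('mcp_claw_level_total', 'Claw sharpening level requested')
)
metrics_registry.register_tool_metric(
    'mcp_sofa_scratch_success_total',
    Counter('mcp_sofa_scratch_success_total', 'Successful sofa scratches')
)
metrics_registry.register_tool_metric(
    'mcp_sofa_scratch_failure_total',
    Counter('mcp_sofa_scratch_failure_total', 'Failed sofa scratches')
)
```

Adding a metric for a new tool then becomes a single register_tool_metric call, which is what makes the pattern scale.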
4. Comprehensive Metric Types
Our implementation leverages all four Prometheus metric types:
Counters
http_requests_total = Counter(
    'http_requests_total',
    'Total HTTP requests',
    ['method', 'handler', 'status']
)
Gauges
active_connections = Gauge(
    'active_connections',
    'Number of active connections'
)
Histograms
http_request_duration_seconds = Histogram(
    'http_request_duration_seconds',
    'HTTP request duration in seconds',
    ['method', 'handler', 'status']
)
Summaries
processing_time = Summary(
    'processing_time_seconds',
    'Time spent processing requests',
    ['operation']
)
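These definitions only declare the metrics; values are recorded at runtime. A short illustration of how each type is typically updated, using the objects defined above (the label values are placeholders):

```python
# Counter: monotonically increasing event count
http_requests_total.labels(method='POST', handler='/mcp', status='200').inc()

# Gauge: a value that can go up as well as down
active_connections.inc()   # connection opened
active_connections.dec()   # connection closed

# Histogram: observations are bucketed, enabling percentile queries in PromQL
http_request_duration_seconds.labels(
    method='POST', handler='/mcp', status='200'
).observe(0.042)

# Summary: records the count and sum of observations; .time() works as a context manager
with processing_time.labels(operation='sharpen_claw').time():
    pass  # the actual work being measured goes here
```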
Docker Compose Configuration
The entire stack is containerized for easy deployment:
version: '3.8'

services:
  app:
    build: .
    ports:
      - "8000:8000"

  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yaml:/etc/prometheus/prometheus.yml

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
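The compose file mounts a local prometheus.yaml into the Prometheus container, but its contents aren't shown above. A minimal scrape configuration, assuming the defaults (the app service exposing metrics on port 8000 at /metrics), might look like:

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'mcp-server'
    # The Compose service name resolves on the shared Docker network
    static_configs:
      - targets: ['app:8000']
```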
Grafana Dashboard Configuration
The project includes pre-configured Grafana dashboards that provide:
- FastAPI Dashboard: Standard HTTP metrics, request rates, and response times
- MCP Tools Dashboard: Custom metrics specific to MCP tool execution
Key Dashboard Features:
- Real-time Metrics: Live updates of tool execution statistics
- Success/Failure Rates: Visual representation of tool reliability
- Performance Trends: Historical data for capacity planning
- Alert Integration: Configurable alerts for anomaly detection
Benefits of This Approach
1. Granular Observability
Every MCP tool can have its own set of metrics, providing detailed insights into individual tool performance.
2. Standardized Monitoring
Using Prometheus ensures compatibility with the broader monitoring ecosystem and provides a standard query language (PromQL).
3. Visual Analytics
Grafana dashboards make it easy to spot trends, anomalies, and performance issues at a glance.
4. Operational Excellence
The combination of metrics, dashboards, and alerts enables proactive monitoring and quick incident response.
Best Practices and Lessons Learned
1. Metric Naming Convention
Follow Prometheus naming conventions:
- Use descriptive names with units (e.g., duration_seconds)
- Include relevant labels for filtering and grouping
- Use consistent prefixes for related metrics
2. Label Cardinality
Be mindful of label cardinality to avoid performance issues:
- Avoid high-cardinality labels (e.g., user IDs, timestamps)
- Use meaningful labels that aid in analysis
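As a rough illustration of the difference (both metric names here are hypothetical):

```python
from prometheus_client import Counter

# Good: "tool" takes a small, bounded set of values
tool_invocations = Counter('mcp_tool_invocations_total', 'Tool invocations', ['tool'])

# Risky: one time series per user ID can explode the number of series Prometheus stores
# user_invocations = Counter('mcp_user_invocations_total', 'Calls per user', ['user_id'])
```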
3. Dashboard Design
Create focused dashboards:
- Separate operational metrics from business metrics
- Use appropriate visualization types for different metric types
- Include both high-level overviews and detailed drill-down views
Future Enhancements
1. Advanced Alerting
Implement sophisticated alerting rules based on:
- Tool failure rates
- Performance degradation
- Resource utilization thresholds
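As a starting point, a Prometheus alerting rule for tool failure rates could be built on the success/failure counters defined earlier; the threshold and evaluation window below are placeholder values:

```yaml
groups:
  - name: mcp-tools
    rules:
      - alert: SofaScratchFailureRateHigh
        # Fires when more than half of scratch attempts fail over the last 10 minutes
        expr: |
          rate(mcp_sofa_scratch_failure_total[10m])
            / (rate(mcp_sofa_scratch_failure_total[10m]) + rate(mcp_sofa_scratch_success_total[10m]))
            > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "MCP tool failure rate above 50% for scratch_sofa"
```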
2. Distributed Tracing
Add distributed tracing with tools like Jaeger or Zipkin to understand request flows across services.
3. Log Correlation
Integrate structured logging with metrics for comprehensive observability.
4. Custom Exporters
Develop custom Prometheus exporters for specific MCP tool metrics that can't be captured through the standard FastAPI instrumentation.
Conclusion
Monitoring MCP servers with Prometheus and Grafana provides a robust foundation for observability in AI/ML infrastructure. This implementation demonstrates how to:
- Integrate custom metrics into MCP tools
- Use a metrics registry pattern for scalable metric management
- Create comprehensive dashboards for operational visibility
- Deploy the entire stack using Docker Compose
The combination of tool-specific metrics, standardized monitoring, and visual analytics creates a powerful observability solution that scales with your MCP infrastructure.
Whether you're running a single MCP server or a distributed system of AI tools, this monitoring approach will provide the insights needed to maintain reliable, performant services.
Getting Started
To implement this monitoring solution in your own MCP server:
1. Clone the repository and review the implementation
2. Customize the metrics for your specific MCP tools
3. Adapt the Grafana dashboards to your monitoring needs
4. Deploy using Docker Compose
5. Start monitoring and optimizing your MCP infrastructure
The future of AI infrastructure depends on robust observability. With this monitoring solution, you're well-equipped to understand, optimize, and scale your MCP servers effectively.
This implementation showcases the power of combining modern monitoring tools with MCP servers to create a comprehensive observability solution. The code is available for exploration and customization to fit your specific monitoring requirements.