MCP Server Monitoring Via Prometheus & Grafana

Gato_Malo
6 min read

Why Monitor MCP Servers?

MCP servers often handle critical AI workloads, and understanding their behavior is essential for:

  • Performance Optimization: Identifying bottlenecks and resource constraints

  • Reliability Assurance: Detecting failures and anomalies before they impact users

  • Capacity Planning: Understanding usage patterns to scale appropriately

  • Debugging: Gaining insights into tool execution patterns and failures

Architecture Overview

Our monitoring solution consists of three main components:

  1. MCP Server with Custom Metrics: A FastAPI-based server that exposes MCP tools with integrated Prometheus metrics

  2. Prometheus: Time-series database for metrics collection and storage

  3. Grafana: Visualization platform for creating dashboards and alerts

How is Monitoring MCP Servers Different from Traditional API Servers?

A key difference is that MCP servers typically expose only a single gateway endpoint (e.g., /mcp). All tool invocations are routed through this one endpoint, so standard per-endpoint API monitoring cannot tell which specific tool was called. Traditional API servers, in contrast, have a separate endpoint for each operation, making it easy to track request types and usage patterns.

To monitor individual tool usage in MCP servers, you must implement custom middleware or decorators for each tool. These components can:

  • Intercept tool calls

  • Record metrics for each tool invocation

  • Attach tool-specific labels to Prometheus metrics

Without these customizations, you only see aggregate traffic to the /mcp endpoint, losing visibility into the performance and reliability of individual tools. This makes MCP monitoring more challenging and requires a tailored approach for meaningful observability.
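
To make this concrete, here is a minimal sketch of such a decorator built on prometheus_client. The metric names, labels, and the instrument_tool helper are illustrative assumptions, not the project's actual API:

import time
from functools import wraps

from prometheus_client import Counter, Histogram

# Illustrative per-tool metrics; names and labels are assumptions
TOOL_CALLS = Counter(
    'mcp_tool_calls_total',
    'Total MCP tool invocations',
    ['tool', 'status']
)
TOOL_DURATION = Histogram(
    'mcp_tool_duration_seconds',
    'MCP tool execution time in seconds',
    ['tool']
)

def instrument_tool(tool_name: str):
    """Wrap an MCP tool so every call is counted and timed per tool."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
                TOOL_CALLS.labels(tool=tool_name, status='success').inc()
                return result
            except Exception:
                TOOL_CALLS.labels(tool=tool_name, status='error').inc()
                raise
            finally:
                TOOL_DURATION.labels(tool=tool_name).observe(time.perf_counter() - start)
        return wrapper
    return decorator

Applied between the tool registration decorator and the function body, the wrapper gives every tool its own labeled time series without touching the tool's logic.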

Implementation Deep Dive

1. Setting Up the MCP Server

The foundation of our monitoring solution is a FastAPI server that hosts MCP tools. Here's how we've structured it:

from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator
from prometheus_client import Counter, Gauge, Histogram, Summary

app = FastAPI(title="Demo Prometheus for Cat", version="0.0.1")

# Instrument the app and expose its metrics for Prometheus to scrape
instrumentor = Instrumentator()
instrumentor.instrument(app).expose(app)
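
By default, expose(app) mounts a /metrics endpoint on the same application, which is the endpoint Prometheus will scrape.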

2. Custom Metrics for MCP Tools

One of the most innovative aspects of this implementation is the custom metrics system for MCP tools. Let's examine a practical example with our "evil cat" tools:

import random

# `mcp` (the tool server) and `metrics_registry` are defined elsewhere in the app

@mcp.tool(name="sharpen_claw", description="For sharpening evil cat's claw to a new level")
def sharp_claw(level: int) -> str:
    # Look up this tool's metric from the shared registry
    claw_level_metric = metrics_registry.get_tool_metric('mcp_claw_level_total')
    if claw_level_metric:
        claw_level_metric.observe(level)

    return f"claw sharpened to level: {level}"

@mcp.tool(name="scratch_sofa", description="For testing evil cat's claw to ruin leather")
def start_scratch_sofa(claw_level: int) -> str:
    sofa_hardness = random.randint(1, 10000)

    if sofa_hardness > claw_level:
        # Count failures so reliability can be graphed per tool
        failure_metric = metrics_registry.get_tool_metric('mcp_sofa_scratch_failure_total')
        if failure_metric:
            failure_metric.inc()
        return "your claws are not sharp enough"

    success_metric = metrics_registry.get_tool_metric('mcp_sofa_scratch_success_total')
    if success_metric:
        success_metric.inc()
    return "sofa is completely ruined. Evil Cat now Laughs"

3. Metrics Registry Pattern

The metrics registry pattern is crucial for managing tool-specific metrics; a sketch of such a registry follows the list below. This approach allows:

  • Centralized Metric Management: All metrics are registered and managed from a single location

  • Dynamic Metric Access: Tools can access their specific metrics without tight coupling

  • Scalability: Easy to add new metrics for new tools without modifying existing code
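
The registry itself isn't shown above, but a minimal sketch might look like the following (everything beyond get_tool_metric is an assumption about the implementation):

from prometheus_client import Counter, Histogram

class MetricsRegistry:
    """Central store mapping metric names to Prometheus metric objects."""
    def __init__(self):
        self._tool_metrics = {}

    def register_tool_metric(self, name, metric):
        self._tool_metrics[name] = metric

    def get_tool_metric(self, name):
        # Return None for unknown names so tools can degrade gracefully
        return self._tool_metrics.get(name)

metrics_registry = MetricsRegistry()
metrics_registry.register_tool_metric(
    'mcp_claw_level_total',
    Histogram('mcp_claw_level_total', 'Requested claw sharpening levels')
)
metrics_registry.register_tool_metric(
    'mcp_sofa_scratch_success_total',
    Counter('mcp_sofa_scratch_success_total', 'Successful sofa scratches')
)
metrics_registry.register_tool_metric(
    'mcp_sofa_scratch_failure_total',
    Counter('mcp_sofa_scratch_failure_total', 'Failed sofa scratches')
)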

4. Comprehensive Metric Types

Our implementation leverages all four Prometheus metric types:

Counters

http_requests_total = Counter(
    'http_requests_total',
    'Total HTTP requests',
    ['method', 'handler', 'status']
)

Gauges

active_connections = Gauge(
    'active_connections',
    'Number of active connections'
)

Histograms

http_request_duration_seconds = Histogram(
    'http_request_duration_seconds',
    'HTTP request duration in seconds',
    ['method', 'handler', 'status']
)

Summaries

processing_time = Summary(
    'processing_time_seconds',
    'Time spent processing requests',
    ['operation']
)
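
To see how values actually get recorded, here is a short, illustrative usage pass over the four definitions above (label values and the operation name are assumptions):

# Counter: monotonically increasing; only ever goes up
http_requests_total.labels(method='POST', handler='/mcp', status='200').inc()

# Gauge: a value that can go up and down
active_connections.set(12)

# Histogram: observations are bucketed for quantile queries in PromQL
http_request_duration_seconds.labels(method='POST', handler='/mcp', status='200').observe(0.042)

# Summary: client-side aggregates; .time() records a duration as a context manager
with processing_time.labels(operation='tool_call').time():
    pass  # the work being timed would go here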

Docker Compose Configuration

The entire stack is containerized for easy deployment:

version: '3.8'
services:
  app:
    build: .
    ports:
      - "8000:8000"

  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yaml:/etc/prometheus/prometheus.yml

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
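
The mounted prometheus.yaml tells Prometheus what to scrape. A minimal sketch might look like this (the job name and interval are illustrative; `app` resolves through Docker's internal DNS):

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'mcp-server'
    static_configs:
      - targets: ['app:8000']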

Grafana Dashboard Configuration

The project includes pre-configured Grafana dashboards that provide:

  1. FastAPI Dashboard: Standard HTTP metrics, request rates, and response times

  2. MCP Tools Dashboard: Custom metrics specific to MCP tool execution

Key Dashboard Features:

  • Real-time Metrics: Live updates of tool execution statistics

  • Success/Failure Rates: Visual representation of tool reliability

  • Performance Trends: Historical data for capacity planning

  • Alert Integration: Configurable alerts for anomaly detection

Benefits of This Approach

1. Granular Observability

Every MCP tool can have its own set of metrics, providing detailed insights into individual tool performance.

2. Standardized Monitoring

Using Prometheus ensures compatibility with the broader monitoring ecosystem and provides a standard query language (PromQL).
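
For example, a failure-ratio query over the demo tool's counters might look like this in PromQL (the 5m window is arbitrary):

rate(mcp_sofa_scratch_failure_total[5m])
  / (rate(mcp_sofa_scratch_success_total[5m]) + rate(mcp_sofa_scratch_failure_total[5m]))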

3. Visual Analytics

Grafana dashboards make it easy to spot trends, anomalies, and performance issues at a glance.

4. Operational Excellence

The combination of metrics, dashboards, and alerts enables proactive monitoring and quick incident response.

Best Practices and Lessons Learned

1. Metric Naming Convention

Follow Prometheus naming conventions:

  • Use descriptive names with units (e.g., duration_seconds)

  • Include relevant labels for filtering and grouping

  • Use consistent prefixes for related metrics

2. Label Cardinality

Be mindful of label cardinality to avoid performance issues, as illustrated after this list:

  • Avoid high-cardinality labels (e.g., user IDs, timestamps)

  • Use meaningful labels that aid in analysis
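
For instance (hypothetical label sets; the high-cardinality variant is left as a comment so it never registers):

from prometheus_client import Counter

# Bad: one time series per user — cardinality grows without bound
# Counter('mcp_tool_calls_by_user_total', 'Tool calls', ['tool', 'user_id'])

# Good: label values are drawn from small, bounded sets
tool_calls = Counter('mcp_tool_calls_by_status_total', 'Tool calls', ['tool', 'status'])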

3. Dashboard Design

Create focused dashboards:

  • Separate operational metrics from business metrics

  • Use appropriate visualization types for different metric types

  • Include both high-level overviews and detailed drill-down views

Future Enhancements

1. Advanced Alerting

Implement sophisticated alerting rules (see the sample rule after this list) based on:

  • Tool failure rates

  • Performance degradation

  • Resource utilization thresholds
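
A sketch of what such a rule might look like in Prometheus' alerting-rule format (the threshold and durations are illustrative):

groups:
  - name: mcp_alerts
    rules:
      - alert: HighToolFailureRate
        expr: |
          rate(mcp_sofa_scratch_failure_total[10m])
            / (rate(mcp_sofa_scratch_success_total[10m])
               + rate(mcp_sofa_scratch_failure_total[10m])) > 0.2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "MCP tool failure rate above 20% for 5 minutes"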

2. Distributed Tracing

Add distributed tracing with tools like Jaeger or Zipkin to understand request flows across services.

3. Log Correlation

Integrate structured logging with metrics for comprehensive observability.

4. Custom Exporters

Develop custom Prometheus exporters for specific MCP tool metrics that can't be captured through the standard FastAPI instrumentation.
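
One possible starting point is a custom collector, which prometheus_client lets you register to compute values at scrape time rather than per event. The collector below is a hypothetical sketch (McpToolCollector and the tool list are stand-ins):

from prometheus_client.core import GaugeMetricFamily, REGISTRY

TOOLS = ['sharpen_claw', 'scratch_sofa']  # stand-in for a real tool registry

class McpToolCollector:
    """Computes metrics on each scrape instead of on each event."""
    def collect(self):
        gauge = GaugeMetricFamily('mcp_registered_tools', 'Number of registered MCP tools')
        gauge.add_metric([], float(len(TOOLS)))
        yield gauge

REGISTRY.register(McpToolCollector())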

Conclusion

Monitoring MCP servers with Prometheus and Grafana provides a robust foundation for observability in AI/ML infrastructure. This implementation demonstrates how to:

  • Integrate custom metrics into MCP tools

  • Use a metrics registry pattern for scalable metric management

  • Create comprehensive dashboards for operational visibility

  • Deploy the entire stack using Docker Compose

The combination of tool-specific metrics, standardized monitoring, and visual analytics creates a powerful observability solution that scales with your MCP infrastructure.

Whether you're running a single MCP server or a distributed system of AI tools, this monitoring approach will provide the insights needed to maintain reliable, performant services.

Getting Started

To implement this monitoring solution in your own MCP server:

  1. Clone the repository and review the implementation

  2. Customize the metrics for your specific MCP tools

  3. Adapt the Grafana dashboards to your monitoring needs

  4. Deploy using Docker Compose

  5. Start monitoring and optimizing your MCP infrastructure

The future of AI infrastructure depends on robust observability. With this monitoring solution, you're well-equipped to understand, optimize, and scale your MCP servers effectively.


This implementation showcases the power of combining modern monitoring tools with MCP servers to create a comprehensive observability solution. The code is available for exploration and customization to fit your specific monitoring requirements.
