
Add pluggable instrumentation interface and request_id logging

Open • dgenio opened this issue 1 month ago • 1 comment

Summary

This PR introduces a pluggable instrumentation interface with a token-based API for the MCP Python SDK, enabling OpenTelemetry and other observability integrations as requested in #421.

Key Innovation: Token-Based API

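The instrumentation interface uses a token-based approach: the start hook returns an opaque token, and the SDK passes that same token back to the end and error hooks, so per-request state travels with the request: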

from typing import Any, Protocol

class Instrumenter(Protocol):
    # "..." marks parameters elided in this summary; it is not literal Python syntax.
    def on_request_start(self, ...) -> Any:  # Returns a token
        """Start instrumentation and return a state token (e.g., an OTel span)."""

    def on_request_end(self, token: Any, ...) -> None:  # Receives the token
        """End instrumentation using the token."""

    def on_error(self, token: Any, ...) -> None:  # Receives the token
        """Handle errors using the token."""

This design enables instrumenters to maintain state (like OpenTelemetry spans) without external storage or side-channels, addressing feedback from the community on API design best practices.
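To make the token flow concrete, here is a minimal sketch of a custom instrumenter that keeps its per-request state in the token. It is not code from this PR: `TimingToken`, `TimingInstrumenter`, and every hook parameter other than `token` (`request_id`, `error`, `**metadata`) are assumptions made for illustration, since the summary above elides the real signatures.

```python
import time
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class TimingToken:
    """Per-request state returned by the start hook (hypothetical type)."""
    request_id: str
    started_at: float


class TimingInstrumenter:
    """Illustrative only: every hook parameter except `token` is an assumption."""

    def __init__(self) -> None:
        # Finished requests as (request_id, outcome, seconds). This is an
        # output log, not a lookup table for in-flight requests.
        self.completed: List[Tuple[str, str, float]] = []

    def on_request_start(self, request_id: str, **metadata: Any) -> TimingToken:
        # Everything the later hooks need is packed into the token.
        return TimingToken(request_id, time.perf_counter())

    def on_request_end(self, token: TimingToken, **metadata: Any) -> None:
        self._record(token, "ok")

    def on_error(self, token: TimingToken, error: BaseException, **metadata: Any) -> None:
        self._record(token, f"error:{type(error).__name__}")

    def _record(self, token: TimingToken, outcome: str) -> None:
        elapsed = time.perf_counter() - token.started_at
        self.completed.append((token.request_id, outcome, elapsed))


# Two overlapping requests: each hook call identifies its request by token alone.
instrumenter = TimingInstrumenter()
ok_token = instrumenter.on_request_start("req-1", method="tools/list")
bad_token = instrumenter.on_request_start("req-2", method="tools/call")
instrumenter.on_error(bad_token, ValueError("bad arguments"))
instrumenter.on_request_end(ok_token)
```

Because the start time lives in the token, the instrumenter needs no dictionary keyed by request ID to find it again, which is the property the token-based design is meant to provide.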

Changes

Core Interface

  • Defined Instrumenter protocol with token-based hooks
  • Created NoOpInstrumenter as default implementation with minimal overhead
  • Token can be any value (span object, dict, etc.)

Integration

  • Added instrumenter parameter to ServerSession and ClientSession constructors
  • Wired instrumentation into Server._handle_request() to track:
    • Request start/end with duration tracking
    • Success/failure status
    • Error occurrences with token propagation
  • Added request_id to logging extra fields for correlation
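The wiring described above might look roughly like the following sketch. It is a simplified stand-in, not the SDK's actual `Server._handle_request()`: the `handle_request` function, its parameters, the logger name, and the choice to report failures through `on_error` only are all assumptions. The one standard-library fact it relies on is that the `extra` argument of `logging` calls attaches its keys to the log record as attributes.

```python
import asyncio
import logging
import time
from typing import Any, Awaitable, Callable, Optional

logger = logging.getLogger("mcp_instrumentation_sketch")  # hypothetical logger name


class NoOpInstrumenter:
    """Default that does nothing, so uninstrumented sessions pay almost no cost."""

    def on_request_start(self, request_id: str, **metadata: Any) -> Any:
        return None

    def on_request_end(self, token: Any, **metadata: Any) -> None:
        pass

    def on_error(self, token: Any, error: BaseException, **metadata: Any) -> None:
        pass


async def handle_request(
    request_id: str,
    handler: Callable[[], Awaitable[Any]],
    instrumenter: Optional[Any] = None,
) -> Any:
    """Hypothetical stand-in for the SDK's request dispatch, not its real code."""
    if instrumenter is None:
        instrumenter = NoOpInstrumenter()
    log_extra = {"request_id": request_id}  # surfaces as record.request_id
    token = instrumenter.on_request_start(request_id)
    started = time.perf_counter()
    try:
        result = await handler()
    except Exception as exc:
        duration = time.perf_counter() - started
        # Assumption: a failed request is reported through on_error only.
        instrumenter.on_error(token, exc, duration=duration)
        logger.warning("request failed", extra=log_extra)
        raise
    duration = time.perf_counter() - started
    instrumenter.on_request_end(token, duration=duration)
    logger.info("request completed", extra=log_extra)
    return result
```

With a formatter such as `logging.Formatter("%(levelname)s [%(request_id)s] %(message)s")`, log lines from one request can be grouped by ID. That format string only works for records that carry `request_id`, so it is best attached to a handler dedicated to this logger.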

OpenTelemetry Example

  • NEW: Complete OpenTelemetryInstrumenter implementation in examples/opentelemetry_instrumentation.py
  • Demonstrates span management using tokens
  • Includes setup code and runnable example
  • Shows proper error recording and status codes

Testing

  • Comprehensive tests verifying:
    • Token flow from start → end/error
    • Hooks are invoked for successful and failed requests
    • request_id is consistent across lifecycle
    • Metadata is passed correctly
    • Default no-op behavior works
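As an illustration of the first three checks, a test can use an instrumenter that records every hook call and then assert that the same token object and the same request_id appear throughout. This is a sketch rather than the PR's test suite: `RecordingInstrumenter`, `run_request`, and the hook parameters are hypothetical, and `run_request` is a synchronous stand-in for the SDK's dispatch path.

```python
from typing import Any, Callable, List, Tuple


class RecordingInstrumenter:
    """Test double that logs each hook call as (hook, token, request_id)."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Any, str]] = []

    def on_request_start(self, request_id: str, **metadata: Any) -> Any:
        token = object()  # unique per request, compared by identity below
        self.events.append(("start", token, request_id))
        return token

    def on_request_end(self, token: Any, *, request_id: str, **metadata: Any) -> None:
        self.events.append(("end", token, request_id))

    def on_error(self, token: Any, error: BaseException, *, request_id: str,
                 **metadata: Any) -> None:
        self.events.append(("error", token, request_id))


def run_request(instrumenter: Any, request_id: str, handler: Callable[[], Any]) -> Any:
    """Stand-in for the SDK's dispatch path (assumed: error hook on failure only)."""
    token = instrumenter.on_request_start(request_id)
    try:
        result = handler()
    except Exception as exc:
        instrumenter.on_error(token, exc, request_id=request_id)
        raise
    instrumenter.on_request_end(token, request_id=request_id)
    return result


def test_token_flows_to_end_hook() -> None:
    inst = RecordingInstrumenter()
    assert run_request(inst, "req-1", lambda: "ok") == "ok"
    start, end = inst.events
    assert (start[0], end[0]) == ("start", "end")
    assert start[1] is end[1]              # same token object
    assert start[2] == end[2] == "req-1"   # same request_id


def test_token_flows_to_error_hook() -> None:
    inst = RecordingInstrumenter()

    def failing() -> None:
        raise RuntimeError("boom")

    try:
        run_request(inst, "req-2", failing)
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected RuntimeError")
    start, error = inst.events
    assert (start[0], error[0]) == ("start", "error")
    assert start[1] is error[1]
    assert start[2] == error[2] == "req-2"
```

Both functions follow pytest's `test_` naming convention, so a runner such as pytest would collect them without extra configuration.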

Documentation

  • Added docs/instrumentation.md with:
    • Token-based API explanation with "Why Tokens?" section
    • Complete OpenTelemetry integration guide
    • Usage examples for server and client
    • Custom metrics example
    • Best practices and migration guide

Benefits

  1. No External Storage: Instrumenters don't need spans = {} dictionaries
  2. OpenTelemetry Compatible: Spans can be returned and passed directly
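  3. Thread-Safe: Each request gets its own token, so concurrent requests share no per-request state
  4. Automatic Cleanup: Tokens are garbage collected once the request finishes and nothing else references them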
  5. Flexible: Token can be any value

Follow-up Work

  • Package OpenTelemetry instrumenter as installable extra (pip install mcp[opentelemetry])
  • Additional built-in instrumenters (Prometheus, StatsD, Datadog)
  • Distributed tracing via params._meta.traceparent propagation
  • Client-side instrumentation (server-side is complete)

Fixes #421

dgenio • Nov 28 '25 12:11

Updated the instrumentation interface to use a token-based API based on community feedback. This enables proper OpenTelemetry integration without external storage. See updated PR description for details.

dgenio • Nov 28 '25 14:11