[feature request] MCP sampling propagation
**Is your feature request related to a problem? Please describe.**

Right now, our MCP instrumentation handles client-to-server propagation of span IDs, which allows MCP client activity to appear in the same trace as the server that responds to it.
https://github.com/Arize-ai/openinference/tree/main/js/packages/openinference-instrumentation-mcp
https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-mcp
```mermaid
sequenceDiagram
    participant Client
    participant Server
    Note over Client: Start root span (Span A)
    Client->>Server: Initiate tool request (with trace context of Span A)
    Note over Server: Start child span (Span B, child of A)
    Server->>Client: Send tool response
    Note over Server: End Span B
    Note over Client: End Span A
```
This suits the vast majority of use cases, but it doesn't yet handle the opposite direction. MCP has a feature called sampling, where the server asks the client to perform an LLM completion. When this occurs, the client activity shows up in a new trace, because propagation isn't applied in the server-to-client direction.
```mermaid
sequenceDiagram
    participant Client
    participant Server
    Note over Client: Start Span A (Trace 1)
    Client->>Server: Initiate tool request (with trace context of Span A)
    Note over Server: Start Span B (Trace 1, child of Span A)
    Server->>Client: Request sampling (without trace context)
    Note over Client: Start Span C (Trace 2, new trace)
    Note over Client: Start Span D (Trace 2, child of Span C) for LLM completion
    Note over Client: End Span D (Trace 2)
    Client->>Server: Send sampling result
    Note over Client: End Span C (Trace 2)
    Note over Server: Continue in Span B (Trace 1)
    Server->>Client: Send tool response
    Note over Server: End Span B (Trace 1)
    Note over Client: End Span A (Trace 1)
```
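Concretely, because the sampling/createMessage request carries no trace context today, the client-side instrumentation has nothing to extract and falls back to starting a new root span. A minimal sketch of that decision (the handler name and the `_meta`/`traceparent` carrier shape here are illustrative assumptions, not the actual instrumentation code):

```python
def client_handle_sampling(params: dict) -> str:
    """Illustrative client-side decision for the parent of Span C."""
    traceparent = params.get("_meta", {}).get("traceparent")
    if traceparent is None:
        # Current behavior: the server sends sampling/createMessage without
        # any trace context, so Span C becomes the root of a new trace.
        return "new-trace-root"
    # If the server did propagate context, the client could continue
    # the server's trace instead of starting Trace 2.
    return traceparent
```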
**Describe the solution you'd like**
Instead, we want the backwards direction to propagate span IDs like this:
```mermaid
sequenceDiagram
    participant Client
    participant Server
    Note over Client: Start root span (Span A)
    Client->>Server: Initiate tool request (with trace context of Span A)
    Note over Server: Start child span (Span B, child of A)
    Server->>Client: Request sampling (with trace context of Span B)
    Note over Client: Start child span (Span C, child of B) for sampling
    Note over Client: Start child span (Span D, child of C) for LLM completion
    Note over Client: End Span D
    Client->>Server: Send sampling result
    Note over Client: End Span C
    Note over Server: Continue in Span B
    Server->>Client: Send tool response
    Note over Server: End Span B
    Note over Client: End Span A
```
There may be some edge cases, but ideally the existing propagation is simply reversed and applied to the sampling/createMessage message as well.
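The exchange above could be sketched as a pair of helpers, one on each side of the sampling/createMessage request. This is a dependency-free illustration using the W3C `traceparent` format and the MCP `_meta` field; the helper names are hypothetical, and a real implementation would go through OpenTelemetry's propagator API rather than building the header by hand:

```python
def inject_trace_context(params: dict, trace_id: str, span_id: str) -> dict:
    """Server side: attach Span B's context to the outgoing
    sampling/createMessage request via the MCP `_meta` field."""
    meta = dict(params.get("_meta", {}))
    # W3C Trace Context header: version-traceid-spanid-flags
    meta["traceparent"] = f"00-{trace_id}-{span_id}-01"
    return {**params, "_meta": meta}

def extract_trace_context(params: dict):
    """Client side: recover (trace_id, parent_span_id) so Span C can be
    started in Trace 1 as a child of Span B instead of in a new trace."""
    traceparent = params.get("_meta", {}).get("traceparent")
    if traceparent is None:
        return None  # no context propagated: fall back to a new trace
    _version, trace_id, span_id, _flags = traceparent.split("-")
    return trace_id, span_id
```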
**Describe alternatives you've considered**

An alternative is to wait until sampling is more widely supported. The feature is experimental and isn't yet used in any obvious way.
However, if you look a bit deeper, OpenInference does have support for a call site into MCP sampling, via beeai.
https://github.com/Arize-ai/openinference/tree/main/js/packages/openinference-instrumentation-beeai
https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-beeai
beeai has an experimental spec and library for agent communication called ACP, so we can expect MCP sampling to see real use through it and trigger this case at some point.
cc @ismaelfaro who is authoring the ACP spec and @pilartomas and @jezekra1 who have authored MCP support for sampling in the ACP sdk.
**Additional context**

- https://github.com/Arize-ai/openinference/pull/1524 began MCP propagation from client to server.
- https://modelcontextprotocol.io/docs/concepts/sampling describes the sampling feature.
- https://docs.beeai.dev/acp/alpha/introduction describes ACP.