[feature request] MCP sampling propagation
**Is your feature request related to a problem? Please describe.**

Right now, our MCP instrumentation handles client-to-server propagation of span IDs, which allows MCP client activity to appear in the same trace as the server that responds to it.
https://github.com/Arize-ai/openinference/tree/main/js/packages/openinference-instrumentation-mcp
https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-mcp
```mermaid
sequenceDiagram
    participant Client
    participant Server
    Note over Client: Start root span (Span A)
    Client->>Server: Initiate tool request (with trace context of Span A)
    Note over Server: Start child span (Span B, child of A)
    Server->>Client: Send tool response
    Note over Server: End Span B
    Note over Client: End Span A
```
This suits the vast majority of use cases, but it doesn't yet handle the opposite direction. MCP has a feature called sampling, where the server asks the client to perform an LLM completion. When this occurs, the client activity shows up in a new trace, because propagation isn't applied in the server-to-client direction.
```mermaid
sequenceDiagram
    participant Client
    participant Server
    Note over Client: Start Span A (Trace 1)
    Client->>Server: Initiate tool request (with trace context of Span A)
    Note over Server: Start Span B (Trace 1, child of Span A)
    Server->>Client: Request sampling (without trace context)
    Note over Client: Start Span C (Trace 2, new trace)
    Note over Client: Start Span D (Trace 2, child of Span C) for LLM completion
    Note over Client: End Span D (Trace 2)
    Client->>Server: Send sampling result
    Note over Client: End Span C (Trace 2)
    Note over Server: Continue in Span B (Trace 1)
    Server->>Client: Send tool response
    Note over Server: End Span B (Trace 1)
    Note over Client: End Span A (Trace 1)
```
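Concretely, because the sampling/createMessage request carries no trace context today, the client-side instrumentation has nothing to extract and falls back to starting a new root span. A minimal sketch of that decision (the handler name and the `_meta`/`traceparent` carrier shape here are illustrative assumptions, not the actual instrumentation code):

```python
def client_handle_sampling(params: dict) -> str:
    """Illustrative client-side decision for the parent of Span C."""
    traceparent = params.get("_meta", {}).get("traceparent")
    if traceparent is None:
        # Current behavior: the server sends sampling/createMessage without
        # any trace context, so Span C becomes the root of a new trace.
        return "new-trace-root"
    # If the server did propagate context, the client could continue
    # the server's trace instead of starting Trace 2.
    return traceparent
```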
**Describe the solution you'd like**
Instead, we want the backwards direction to propagate span IDs like this:
```mermaid
sequenceDiagram
    participant Client
    participant Server
    Note over Client: Start root span (Span A)
    Client->>Server: Initiate tool request (with trace context of Span A)
    Note over Server: Start child span (Span B, child of A)
    Server->>Client: Request sampling (with trace context of Span B)
    Note over Client: Start child span (Span C, child of B) for sampling
    Note over Client: Start child span (Span D, child of C) for LLM completion
    Note over Client: End Span D
    Client->>Server: Send sampling result
    Note over Client: End Span C
    Note over Server: Continue in Span B
    Server->>Client: Send tool response
    Note over Server: End Span B
    Note over Client: End Span A
```
There may be some edge cases, but ideally the existing propagation is simply reversed and applied to the sampling/createMessage message as well.
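The exchange above could be sketched as a pair of helpers, one on each side of the sampling/createMessage request. This is a dependency-free illustration using the W3C `traceparent` format and the MCP `_meta` field; the helper names are hypothetical, and a real implementation would go through OpenTelemetry's propagator API rather than building the header by hand:

```python
def inject_trace_context(params: dict, trace_id: str, span_id: str) -> dict:
    """Server side: attach Span B's context to the outgoing
    sampling/createMessage request via the MCP `_meta` field."""
    meta = dict(params.get("_meta", {}))
    # W3C Trace Context header: version-traceid-spanid-flags
    meta["traceparent"] = f"00-{trace_id}-{span_id}-01"
    return {**params, "_meta": meta}

def extract_trace_context(params: dict):
    """Client side: recover (trace_id, parent_span_id) so Span C can be
    started in Trace 1 as a child of Span B instead of in a new trace."""
    traceparent = params.get("_meta", {}).get("traceparent")
    if traceparent is None:
        return None  # no context propagated: fall back to a new trace
    _version, trace_id, span_id, _flags = traceparent.split("-")
    return trace_id, span_id
```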
**Describe alternatives you've considered**

An alternative is to wait until sampling is more widely supported. The feature is experimental and isn't yet used in any obvious way.
However, if you look a bit deeper, OpenInference does have support for a call site into MCP sampling, via beeai.
https://github.com/Arize-ai/openinference/tree/main/js/packages/openinference-instrumentation-beeai
https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-beeai
beeai has an experimental spec and library for agent communication called ACP, so we can expect MCP sampling to see real use through it and trigger this case at some point.
cc @ismaelfaro who is authoring the ACP spec and @pilartomas and @jezekra1 who have authored MCP support for sampling in the ACP sdk.
**Additional context**

- https://github.com/Arize-ai/openinference/pull/1524 began MCP propagation from client to server.
- https://modelcontextprotocol.io/docs/concepts/sampling describes the sampling feature.
- https://docs.beeai.dev/acp/alpha/introduction describes ACP.