feat: tidy otel and implement http and MCP propagation
Summary
This PR improves the OpenTelemetry implementation so that it correctly creates distributed traces.
It focuses on trace propagation, both to LLM providers (HTTP headers) and to MCP servers (request metadata).
It also prefers standard OTel environment configuration and defaults over hard-coded values, allowing drop-in usage with the same environment variables as Python or other language agents.
Type of Change
- [x] Feature
- [ ] Bug fix
- [x] Refactor / Code quality
- [ ] Performance improvement
- [ ] Documentation
- [x] Tests
- [ ] Security fix
- [ ] Build / Release
- [ ] Other (specify below)
Testing
- Start ollama
OLLAMA_CONTEXT_LENGTH=131072 OLLAMA_HOST=0.0.0.0 ollama serve
- Start otel-tui
docker run --rm -it --name otel-tui -p 4318:4318 ymtdzzz/otel-tui:latest
- Start envoy ai gateway to proxy ollama and kiwi
$ docker run --rm --name aigw -p 1975:1975 \
-e OPENAI_BASE_URL=http://host.docker.internal:11434/v1 -e OPENAI_API_KEY=unused \
-e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4318 -e OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf \
-e OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE=delta \
-e OTEL_METRIC_EXPORT_INTERVAL=100 -e OTEL_BSP_SCHEDULE_DELAY=100 \
envoyproxy/ai-gateway-cli run --mcp-json '{"mcpServers":{"kiwi":{"type":"http","url":"https://mcp.kiwi.com"}}}'
- Save this as kiwi_recipe.yaml
prompt: |
  Use the search-flight tool to search for flights from New York to Los Angeles on {{flight_date}}.
  When specifying the departureDate parameter, use the format dd/mm/yyyy.
  Display the first 3 results with the final_output tool in a specific JSON format.
  The output is in the following format where price includes the currency symbol:
  {
    "contents": [
      {
        "airline": "...",
        "flight_number": "...",
        "price": "..."
      },
      ...
    ]
  }
  ANY FLIGHT IS FINE. DO NOT CONSIDER ANY USER PREFERENCES.*
extensions:
  - name: mcp_gateway
    type: streamable_http
    uri: http://127.0.0.1:1975/mcp
parameters:
  - key: flight_date
    input_type: string
    requirement: required
    description: flight date in DD/MM/YYYY format
    default: "31/12/2025"
settings:
  # Remove this if using GPT-5 series (e.g., gpt-5, gpt-5-mini, gpt-5-nano)
  temperature: 0.0
response:
  json_schema:
    type: object
    additionalProperties: false
    required:
      - contents
    properties:
      contents:
        type: array
        description: "First 3 flight results in a consistent, parseable structure."
        minItems: 1
        maxItems: 3
        items:
          type: object
          additionalProperties: false
          required:
            - airline
            - flight_number
            - price
          properties:
            airline:
              type: string
              description: "Airline name"
            flight_number:
              type: string
              description: "Flight number"
            price:
              type: string
              description: "Price with currency symbol"
- Run goose pointing to above!
OPENAI_HOST=http://127.0.0.1:1975 OPENAI_API_KEY=test-key \
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf \
OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE=delta \
OTEL_METRIC_EXPORT_INTERVAL=100 OTEL_BSP_SCHEDULE_DELAY=100 \
cargo run --bin goose -- run --provider openai --model qwen3:1.7b --recipe kiwi_recipe.yaml --params flight_date=31/12/2025
Related Issues
#75
Screenshots/Demos (for UX changes)
On the first pass this seems like a nice touch-up. Thank you! I will need to run this through with our internal OTEL client as some changes might cause metric drops (particularly provider flushing). Will post an official review after that.
@Kvadratni thanks and feedback welcome. I'm quite new at rust so any suggestions are welcome!
putting this back in draft because the integration test isn't working. I will be out next week so bear this in mind. happy to get feedback meanwhile if you like, or you can wait until I pull it out of draft again
Thank you so much for working on this. Just wanted to make sure - have you linked to the proper related issue? Just so it closes or updates what is related <3
@taniandjerry thanks for the feedback. This topic is fairly ancient, going back to the python exchange days; most recently #203, and the polishing here is a follow-up to my old python PR #75. Would it help for me to also make an issue which captures the motivation parts of the PR description, then link this back to it?
This PR is unrelated to hacktoberfest. @taniandjerry
No worries on issue number @codefromthecrypt !!
cool will retouch this up once https://github.com/block/goose/pull/5165 merges so I make sure I get my code fashion together ;)
@codefromthecrypt I'd love to have a completed version of this one. Do you have bandwidth to update it?
sorry yeah.. I got distracted on a side-quest about the claude provider and MCP based approval integration with goose. lemme refocus on this now
yesterday escaped me.. doing it now
update: I'm working hard on this, but I need to stop for the day. I don't want to raise this again without a 100% verified integration test approach for validating propagation. This is somewhat new ground as the only thing similar is the MCP replay tests. I prefer to delay another day to do this right than not, so hopefully understandable.
@codefromthecrypt of course - let us know whenever ready for a review! It's a big feature and will be huge to get it right
I'm sorry, but I'm unable to complete this PR and have to step back from it. The code affected by instrumentation has diverged from main since I started, and my initial understanding was already limited, making it hard to catch up.
I've struggled to implement robust tests that would prevent future drift. My limited Rust experience is a big factor here—this section involves intricate instrumentation across multiple components, without sufficient end-to-end checks to ensure spans are properly initiated, terminated, and parented. While I've built similar features in Java, Go, and Python (where CLI testing feels more flexible), the setup in this project has proven challenging for me. I've also tried using agents to assist, but they haven't been effective for this level of complexity.
Ultimately, this has consumed more time than anticipated, leading to frustration and delaying other priorities. With upcoming travel, I don't foresee gaining the necessary proficiency in the next few weeks. I believe this would be better handled by someone more familiar with the project's quirks, stronger in Rust, and able to introduce a new fixture for comprehensive end-to-end side-effect testing—such as running the Goose CLI and/or Playwright with OTEL environment variables against mocked OpenAI and MCP services, then verifying both traces and propagated data. I'd be happy to sketch more what that would look like, but I have to call bankruptcy on salvaging this PR.
PS I had an LLM reverse engineer a guide from my local changes. hope it helps the next person (and doesn't have a hallucination I missed!)
Async + OpenTelemetry
Libraries in Use
Async Runtime
- tokio - Async runtime
- async_stream - Macro for creating streams (`try_stream!`)
- futures - Stream traits and combinators
Tracing/OpenTelemetry
- tracing - Structured logging and tracing
- tracing-opentelemetry - Bridge between tracing and OpenTelemetry
- opentelemetry - Core OpenTelemetry API
- opentelemetry-sdk - OpenTelemetry SDK implementation
How OpenTelemetry Span Context Works
Thread-Local Storage Mechanism
OpenTelemetry in Rust uses thread-local storage via tracing's subscriber system to track the "current span":
// Conceptual sketch: tracing's subscriber tracks the "current span" in
// thread-local storage (the real implementation differs in detail)
use std::cell::RefCell;
use opentelemetry::trace::SpanContext;

thread_local! {
    static CURRENT_SPAN: RefCell<Option<SpanContext>> = RefCell::new(None);
}
The Critical Problem with Async
Thread-local storage does NOT propagate across .await points.
When an async task yields at .await:
- The future may resume on a different thread
- Thread-local span context is lost
- Child operations have no parent span
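A minimal sketch of the failure and the fix, assuming a multi-threaded tokio runtime with the tracing-opentelemetry layer installed; the span names, sleep, and messages are illustrative only:

```rust
use std::time::Duration;
use tracing::Instrument;

async fn lost_parent() {
    let span = tracing::info_span!("operation");
    let _guard = span.enter(); // sets the thread-local "current span"

    // After this .await the future may resume on a different worker thread,
    // where `operation` is not the current span: events and child spans
    // recorded there end up with no parent.
    tokio::time::sleep(Duration::from_millis(10)).await;
    tracing::info!("may be orphaned");
}

async fn kept_parent() {
    // Wrapping the whole future re-enters the span on every poll, on
    // whichever thread the runtime resumes it (this is essentially what
    // #[instrument] does under the hood; see Pattern 1 below).
    async {
        tokio::time::sleep(Duration::from_millis(10)).await;
        tracing::info!("always nested under `operation`");
    }
    .instrument(tracing::info_span!("operation"))
    .await;
}
```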
The Bridge: tracing-opentelemetry
The tracing-opentelemetry crate bridges Rust's tracing ecosystem with OpenTelemetry:
tracing (Rust ecosystem)
↓ tracing-subscriber Layer system
tracing-opentelemetry (bridge crate)
├─ Implements tracing::Layer
├─ Stores OpenTelemetry context in tracing span
└─ Provides extension traits to access it
↓
OpenTelemetrySpanExt::context() - extract OTel context from tracing span
TraceContextExt::span() - extract OTel span from context
↓
opentelemetry-sdk (OTel implementation)
SpanContext { trace_id, span_id, trace_flags }
Key insight: There are two different span types:
- `tracing::Span` - Rust's structured logging/tracing span
- `opentelemetry::trace::Span` - OpenTelemetry's distributed tracing span
They're connected via the tracing-opentelemetry bridge layer.
Alternative: Task-Local Storage
tokio::task_local! does propagate across .await points within the same task:
task_local! {
pub static SESSION_ID: Option<String>;
}
Used for session_id propagation (see commit 0515c76e84806e0d328e62fa1f5b5ff90475908d).
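A minimal sketch of how a task-local value is scoped and read, assuming tokio; the declaration mirrors SESSION_ID above, and the function names and session value are illustrative:

```rust
tokio::task_local! {
    pub static SESSION_ID: Option<String>;
}

async fn handle_request() {
    // Everything inside the scoped future, including across .await points
    // and through nested calls, can read SESSION_ID via .with(...).
    SESSION_ID
        .scope(Some("20251106_42".to_string()), async {
            do_work().await;
        })
        .await;
}

async fn do_work() {
    let session = SESSION_ID.with(|id| id.clone());
    tracing::debug!(?session, "session id is visible across awaits in this task");
}
```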
Correct Instrumentation Patterns
Pattern 1: Regular Async Functions with #[instrument]
Use #[instrument] macro for regular async functions:
#[instrument(name = "operation", skip(param1, param2), fields(session.id = %session_id))]
async fn operation(param1: String, param2: &Data, session_id: &str) -> Result<Output> {
// Span is active for entire function execution
// Propagates correctly across all .await points
let result = child_operation().await?;
Ok(result)
}
How it works:
- `#[instrument]` uses the `tracing::Instrument` trait
- Creates a span when the function is called
- The span remains active across all `.await` points
- Ends when the function completes
When to use:
- Standard async functions where the function name matches desired span name
- When you need simple span attributes via the `fields()` parameter
- Most common case for instrumentation
Pattern 2: Manual Span Creation
Create spans manually when you need fine-grained control:
fn create_chat_span(model_name: &str) -> tracing::Span {
tracing::info_span!(
"chat", // Span name for tracing
otel.name = format!("chat {}", model_name), // OTel display name
gen_ai.request.model = %model_name,
gen_ai.system = "openai",
gen_ai.operation.name = "chat"
)
}
async fn complete(...) -> Result<Message> {
let span = create_chat_span(&model_name);
let _guard = span.enter();
// All work happens under this span
let response = self.client.post(url).send().await?;
Ok(response)
}
When to use instead of #[instrument]:
- Need custom OpenTelemetry attributes (like `otel.name`)
- Following OpenTelemetry semantic conventions (e.g., GenAI)
- Want different span name than function name
- Need to construct span attributes dynamically
Why manual creation:
- `#[instrument]` doesn't support the `otel.name` attribute
- GenAI semantic conventions require specific span naming: `"{operation} {model}"`
- Need precise control over span attributes beyond what `#[instrument]` provides
Pattern 3: Spawned Tasks
Use .in_current_span() or .instrument(span) for tokio::spawn:
use tracing::Instrument; // provides .in_current_span() and .instrument()

// Variant 1: Capture current span
let handle = tokio::spawn(
async move {
// work happens here with current span as parent
}.in_current_span()
);
// Variant 2: Propagate specific span
let span = tracing::Span::current();
let handle = tokio::spawn(
async move {
// work happens here with captured span as parent
}.instrument(span)
);
How it works:
- `.in_current_span()` captures the current span at call time
- `.instrument(span)` uses a specific span you provide
- All operations in spawned task nest under captured span
When to use each:
- `.in_current_span()`: when you want the current span
- `.instrument(span)`: when you need a specific span (more flexible)
Pattern 4: Lazy Streams (async_stream::try_stream!)
Capture span before stream creation, enter when stream executes:
async fn create_stream(...) -> Result<BoxStream<'_, Result<Item>>> {
// 1. Capture current span before creating lazy stream
let span = tracing::Span::current();
Ok(Box::pin(async_stream::try_stream! {
// 2. Enter the span when stream executes
// IMPORTANT: Use underscore-prefixed name to keep guard alive
let _guard = span.enter();
// 3. All operations now have correct parent
loop {
let item = do_work().await?;
yield item;
}
// Guard drops here, exiting span
}))
}
CRITICAL: Span Guard Lifetime
// WRONG - guard drops immediately, span not active
let _ = span.enter();
// CORRECT - guard lives until end of scope
let _guard = span.enter();
let _span_guard = span.enter(); // Also correct
The underscore prefix (_guard) suppresses unused variable warnings while keeping the guard alive for the entire scope.
Why this works:
- `async_stream::try_stream!` creates a lazy stream
- The stream is not executing when the function returns
- `#[instrument]` on the function would end the span when the function returns
- Must capture the span while it's active, then enter it when the stream executes
- The guard must stay alive to keep the span active
Why NOT use .in_current_span():
- `.in_current_span()` is for `Future` types
- `async_stream::try_stream!` is a macro that expands to a stream, not a future
- Cannot apply `.in_current_span()` to the macro expansion
OpenTelemetry Span Hierarchy
Span Context Components
Each span has:
- `trace_id` (128-bit, 32 hex chars) - identifies the entire distributed trace
- `span_id` (64-bit, 16 hex chars) - identifies this specific span
- `parent_span_id` (64-bit, 16 hex chars) - identifies the parent span
- `trace_flags` (8-bit, 2 hex chars) - sampling and other flags
Parent-Child Relationships
When creating a span, OpenTelemetry:
- Reads "current span" from thread-local storage
- Uses its `span_id` as this span's `parent_span_id`
- Inherits the `trace_id` from the parent
- Generates a new unique `span_id` for this span
Validation Rules
A span context is valid only if:
- `trace_id != 0x00000000000000000000000000000000` (not all zeros)
- `span_id != 0x0000000000000000` (not all zeros)
Invalid span contexts:
- Are not propagated across process boundaries
- Result in silent skipping of trace context injection
- Cause remote processes to start new independent traces
Propagation Methods
- Within a process: Via thread-local storage (with instrumentation patterns)
- Across processes (HTTP): via the `traceparent` HTTP header (W3C Trace Context)
- Across processes (MCP): via the `_meta.traceparent` field in JSON-RPC requests
Cross-Process Trace Context Propagation
Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│ Goose Process (Client) │
│ │
│ ┌──────────────┐ │
│ │ Active Span │ (thread-local) │
│ │ trace_id: T │ │
│ │ span_id: S │ │
│ └──────┬───────┘ │
│ │ │
│ │ Extract & Serialize │
│ ↓ │
│ ┌────────────────────────────────┐ │
│ │ W3C Traceparent String │ │
│ │ "00-T-S-01" │ │
│ └────┬───────────────────┬───────┘ │
│ │ │ │
│ │ │ │
└───────┼───────────────────┼─────────────────────────────────┘
│ │
│ HTTP │ MCP
│ (Header) │ (Meta Field)
↓ ↓
┌──────────────┐ ┌──────────────┐
│ LLM Provider │ │ MCP Server │
│ (Remote) │ │ (Remote) │
└──────────────┘ └──────────────┘
W3C Trace Context Format
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
│ │ │ │
│ │ │ └─ trace-flags (sampling, etc)
│ │ └──────────────── parent-id (this span's ID)
│ └──────────────────────────────────────────────── trace-id (distributed trace identifier)
└──────────────────────────────────────────────────── version (always 00)
Component Meanings:
- version: Protocol version (currently always `"00"`)
- trace-id: 128-bit globally unique trace identifier
  - Same across all processes in the trace
  - Identifies the entire distributed transaction
  - Format: 32 hexadecimal characters
- parent-id: 64-bit span identifier
  - The span ID from the sending process
  - Becomes the `parent_span_id` in the receiving process
  - Format: 16 hexadecimal characters
- trace-flags: 8-bit flags
  - Bit 0: sampled flag (`01` = sampled, `00` = not sampled)
  - Other bits reserved for future use
  - Format: 2 hexadecimal characters
State Transformation Pipeline
The span context goes through several transformations when crossing process boundaries:
In-Memory Representation:
SpanContext {
trace_id: u128, // 128-bit integer
span_id: u64, // 64-bit integer
trace_flags: u8 // 8-bit flags
}
↓
Extraction (via OpenTelemetry APIs):
tracing::Span::current() // Get tracing span
.context() // Get OTel context (via OpenTelemetrySpanExt)
.span() // Get OTel span (via TraceContextExt)
.span_context() // Get SpanContext
.is_valid() // Validate non-zero
↓
Serialization:
format!("00-{:032x}-{:016x}-{:02x}",
trace_id, span_id, flags.to_u8())
↓
Transport Encoding:
HTTP: traceparent: {string}
MCP: _meta: { "traceparent": "{string}" }
↓
Wire Format:
HTTP: Plain header in HTTP/1.1 or HTTP/2
MCP: JSON object field in JSON-RPC
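A hedged sketch of the extraction and serialization steps, consistent with the verification snippet later in this guide; the helper name traceparent_from_current_span is illustrative, not the project's actual API:

```rust
use opentelemetry::trace::TraceContextExt;        // .span() on a Context
use tracing_opentelemetry::OpenTelemetrySpanExt;  // .context() on a tracing::Span

/// Returns the W3C traceparent for the current span, or None if there is
/// no valid (non-zero) span context to propagate.
fn traceparent_from_current_span() -> Option<String> {
    // Requires the tracing-opentelemetry layer to be installed.
    let context = tracing::Span::current().context();
    let span_context = context.span().span_context().clone();

    if !span_context.is_valid() {
        return None; // silent skip: nothing worth propagating
    }

    Some(format!(
        "00-{:032x}-{:016x}-{:02x}",
        span_context.trace_id(),
        span_context.span_id(),
        span_context.trace_flags().to_u8()
    ))
}
```

The SDK's W3C TraceContextPropagator (reached through opentelemetry's global text-map propagator) produces the same string; the manual formatting here just mirrors the pipeline above.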
HTTP Propagation
Injection Point: The trace context is injected during HTTP request building, after the request is constructed but before it's sent.
ApiRequestBuilder::build()
↓
1. Build base HTTP request
2. Add authentication headers
3. Add session ID header (if present)
4. [TRACE PROPAGATION]
a. Extract current span context
b. Validate span context is_valid()
c. Format as W3C traceparent
d. Inject as "traceparent" header
5. Send request
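A minimal sketch of step 4 using reqwest, reusing the traceparent_from_current_span helper sketched earlier; the project's actual ApiRequestBuilder will differ:

```rust
fn inject_trace_context(request: reqwest::RequestBuilder) -> reqwest::RequestBuilder {
    // Only attach the header when there is a valid, active span;
    // otherwise the request goes out unchanged (silent skip).
    match traceparent_from_current_span() {
        Some(traceparent) => request.header("traceparent", traceparent),
        None => request,
    }
}
```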
Example HTTP Request:
POST /v1/chat/completions HTTP/1.1
Host: api.openai.com
Authorization: Bearer sk-...
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
Content-Type: application/json
{ "model": "gpt-4o", "messages": [...] }
When propagation is skipped:
- No active span (span context is invalid)
- Span context has all-zero trace_id or span_id
- Tracing is not initialized or disabled
- Behavior: Request proceeds without trace context (silent skip)
MCP Propagation
The Extensions Mechanism: MCP uses a type-erased Extensions map that gets serialized as _meta in JSON-RPC.
MCP Request Flow:
ClientRequest::CallToolRequest
↓
Extensions (type-erased map)
↓
inject_session_into_extensions()
├─ Extract existing Meta object (if any)
├─ Create new map or use existing
├─ Add "session_id" field (if available)
└─ Call inject_trace_context()
├─ Get current span
├─ Extract span_context
├─ Validate is_valid()
├─ Format as W3C traceparent
└─ Add "traceparent" field to Meta map
↓
Merged Meta inserted back into Extensions
↓
JSON-RPC serialization adds _meta object
↓
Wire format (JSON)
Example MCP Request:
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "read_file",
"arguments": { "path": "/tmp/file.txt" }
},
"_meta": {
"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
"session_id": "20251106_42"
}
}
Key insights:
- The `Meta` type wraps a `HashMap<String, Value>`
- It's part of MCP's `Extensions` system
- Gets serialized as `_meta` in JSON-RPC
- Multiple metadata entries can coexist (trace context + session ID + others)
- Propagation merges with existing metadata rather than replacing it
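A hedged sketch of the merge step using a plain serde_json map in place of the actual rmcp Meta/Extensions types; inject_into_meta and its signature are illustrative:

```rust
use serde_json::{Map, Value};

/// Merge trace context (and optionally a session id) into an existing
/// _meta map without clobbering fields that are already present.
fn inject_into_meta(mut meta: Map<String, Value>, session_id: Option<&str>) -> Map<String, Value> {
    if let Some(id) = session_id {
        meta.entry("session_id".to_string())
            .or_insert_with(|| Value::String(id.to_string()));
    }
    if let Some(traceparent) = traceparent_from_current_span() {
        meta.entry("traceparent".to_string())
            .or_insert_with(|| Value::String(traceparent));
    }
    meta
}
```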
Remote Parent Relationships
When trace context crosses a process boundary, the span hierarchy is maintained through ID linkage:
Process A Wire Process B
┌─────────────┐ ┌─────────────┐
│ Span Alpha │ │ │
│ trace: T │ │ │
│ span: AAAA │──────────────────────────────────→│ │
│ │ traceparent: 00-T-AAAA-01 │ Span Beta │
└─────────────┘ │ trace: T │
│ parent: AAAA│
Parent span in Process A │ span: BBBB │
└─────────────┘
Child span in Process B
Critical concept:
- Process A's `span_id` (AAAA) becomes Process B's `parent_span_id`
- Process B generates a new `span_id` (BBBB) for its own span
- The `trace_id` (T) is identical in both processes
- This creates a distributed span hierarchy across process boundaries
Receiver Implementation Pattern (Abstract)
When a service receives a request with trace context:
Receiver Process:
1. Extract traceparent from transport:
HTTP: Read "traceparent" header
MCP: Read _meta.traceparent field
2. Parse W3C format:
Split on "-" → [version, trace_id, parent_id, flags]
Validate format (4 components, correct lengths)
Convert hex strings to binary
3. Create remote span context:
RemoteContext {
trace_id: (from traceparent),
parent_id: (from traceparent)
}
4. Create local span as child:
LocalSpan {
trace_id: RemoteContext.trace_id, ← inherited
parent_span_id: RemoteContext.parent_id, ← links to remote parent
span_id: (generate new) ← unique to this span
}
5. Execute operation under span
6. Span hierarchy is now distributed:
Remote Span (AAAA) → Local Span (BBBB)
Connected via parent_span_id = AAAA
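The same receiver steps sketched in Rust with only the standard library; W3cTraceparent and parse_traceparent are illustrative names, not types from any SDK:

```rust
/// Parsed W3C traceparent: version 00, non-zero trace and parent span ids.
struct W3cTraceparent {
    trace_id: u128,
    parent_span_id: u64,
    flags: u8,
}

fn parse_traceparent(header: &str) -> Option<W3cTraceparent> {
    let parts: Vec<&str> = header.split('-').collect();
    // 1. Validate shape: 4 components with the expected lengths and version.
    if parts.len() != 4
        || parts[0] != "00"
        || parts[1].len() != 32
        || parts[2].len() != 16
        || parts[3].len() != 2
    {
        return None;
    }
    // 2. Convert hex strings to binary.
    let trace_id = u128::from_str_radix(parts[1], 16).ok()?;
    let parent_span_id = u64::from_str_radix(parts[2], 16).ok()?;
    let flags = u8::from_str_radix(parts[3], 16).ok()?;
    // 3. Reject all-zero ids: they are not valid span contexts.
    if trace_id == 0 || parent_span_id == 0 {
        return None;
    }
    Some(W3cTraceparent { trace_id, parent_span_id, flags })
}
```

Step 4 (creating the local child span with a freshly generated span_id) is then handled by the receiver's own tracing SDK.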
Propagation Timing
The trace context is extracted and injected at a specific point in the request lifecycle:
Timeline:
t0: Span created (in-process)
t1: Span entered (becomes "current")
t2: Code initiates outbound request
t3: ← PROPAGATION HAPPENS HERE ←
- Extract current span context from thread-local
- Serialize to W3C traceparent format
- Inject into request (header or meta field)
t4: Request sent over network
t5: Span continues execution (still active)
t6: Response received
t7: Span ends
Critical point: Propagation (t3) happens while the span is still active. The span context is captured at request-build time, not at span-creation time.
Error Scenarios
Scenario 1: No active span
Current span is invalid or doesn't exist
→ Propagation is silently skipped
→ Request has no trace context
→ Remote process starts new trace (if instrumented)
Scenario 2: Invalid span context
Span context has all-zero trace_id or span_id
→ is_valid() returns false
→ Propagation is silently skipped
→ Request proceeds without trace context
Scenario 3: Tracing disabled
Tracing not initialized or disabled
→ No span context available
→ Request proceeds without trace context
Scenario 4: Remote doesn't support tracing
Traceparent sent but remote ignores it
→ No remote spans created
→ Trace ends at process boundary
→ Not an error, just incomplete trace
Scenario 5: Format mismatch
Remote expects different format (not W3C)
→ Trace context not recognized
→ Remote may start independent trace
OpenTelemetry Semantic Conventions
GenAI Semantic Conventions
For LLM operations, we follow the OpenTelemetry GenAI semantic conventions:
Required attributes:
- `gen_ai.operation.name` - type of operation (e.g., "chat", "embedding")
- `gen_ai.system` - the AI system (e.g., "openai", "anthropic")
- `gen_ai.request.model` - model identifier (e.g., "gpt-4o", "claude-sonnet-4")
Span naming:
- Format: `"{operation} {model}"`
- Examples: `"chat gpt-4o"`, `"embedding text-embedding-3-small"`
- Implemented via the `otel.name` attribute (not the tracing span name)
Why manual span creation:
- The `#[instrument]` macro doesn't support the `otel.name` attribute
- Need a custom span name format different from the function name
- Require specific GenAI attributes not supported by `#[instrument]`
Example:
fn create_chat_span(model_name: &str) -> tracing::Span {
tracing::info_span!(
"chat", // Internal tracing name
otel.name = format!("chat {}", model_name), // OTel display name
gen_ai.request.model = %model_name,
gen_ai.system = "openai",
gen_ai.operation.name = "chat"
)
}
Tool Execution Spans
Tool calls create spans with:
- Name: `"gen_ai.tool.call"`
- Attributes: `tool.name` - name of the tool being called
- Lifetime: the span is kept alive during the entire tool execution
Pattern:
fn tool_stream<S, F>(
rx: S,
done: F,
tool_name: Option<String>
) -> ToolStream {
Box::pin(async_stream::stream! {
// Create and enter span if tool name provided
let tool_span = tool_name.as_ref().map(|name| {
tracing::info_span!("gen_ai.tool.call", tool.name = %name)
});
let _span_guard = tool_span.as_ref().map(|span| span.enter());
// Tool execution happens here under span
// ...
// Span ends when guard drops
})
}
CLI Execution Spans
The CLI creates root spans for recipe and run execution:
Recipe execution:
let span = tracing::info_span!(
"recipe.execute",
session.id = %session_id,
recipe.name = %recipe_name
);
let _enter = span.enter();
// Recipe execution happens here
Run execution:
let span = tracing::info_span!(
"run.execute",
session.id = %session_id
);
let _enter = span.enter();
// Run execution happens here
These root spans become the parent for all agent operations, creating a complete trace hierarchy.
Example Patterns
Example 1: LLM Provider Span Pattern
fn create_chat_span(model_name: &str) -> tracing::Span {
tracing::info_span!(
"chat",
otel.name = format!("chat {}", model_name),
gen_ai.request.model = %model_name,
gen_ai.system = "openai",
gen_ai.operation.name = "chat"
)
}
async fn stream(...) -> Result<MessageStream, ProviderError> {
// Create span before lazy stream
let span = create_chat_span(&self.model.model_name);
Ok(Box::pin(try_stream! {
// Enter span when stream executes
let _span_guard = span.enter();
// HTTP request happens here with correct parent span
// The API client will inject traceparent header automatically
let response = self.client.post(url).send().await?;
// Stream response processing
// ...
// Span ends when guard drops (stream completes)
}))
}
Uses the lazy stream pattern with manual span creation for GenAI conventions.
Example 2: Agent Reply Pattern
#[instrument(name = "agent.reply", skip(self, user_message, session_config),
fields(user_message, session.id = %session_config.id))]
pub async fn reply(
&self,
user_message: Message,
session_config: &SessionConfig,
) -> Result<AgentStream> {
// #[instrument] creates span for this function
self.reply_internal(user_message, session_config).await
}
async fn reply_internal(
&self,
user_message: Message,
session_config: &SessionConfig,
) -> Result<BoxStream<'_, Result<AgentEvent>>> {
// Capture span from reply() before creating lazy stream
let reply_span = tracing::Span::current();
Ok(Box::pin(async_stream::try_stream! {
// Enter span when stream executes
let _reply_span_guard = reply_span.enter();
loop {
// LLM calls here have correct parent span (agent.reply)
let mut stream = Self::stream_response_from_provider(...).await?;
// Tool calls here also have correct parent span
let tool_result = self.dispatch_tool_call(...).await;
// All nested operations inherit the agent.reply span as parent
}
}))
}
Two-level pattern:
- `reply()` uses `#[instrument]` to create a named span with attributes
- `reply_internal()` captures and enters the span in the lazy stream
- The session-naming task is instrumented with the same span (if spawned)
Testing Span Hierarchies
Expected Hierarchy Example
run.execute (CLI root span)
└─ agent.reply (from #[instrument])
├─ chat gpt-4o-mini (session naming, if needed)
├─ gen_ai.tool.call (MCP tool)
│ └─ (tool execution in remote MCP server)
└─ chat gpt-4o (main completion)
Conceptual Validation Pattern
When testing distributed traces, verify:
1. Parent-Child Relationships:
// Child's parent_span_id should match parent's span_id
assert_eq!(
child_span.parent_span_id,
parent_span.span_context.span_id(),
"Child should link to parent"
);
2. Shared Trace ID:
// All spans in same trace share trace_id
assert_eq!(
child_span.span_context.trace_id(),
parent_span.span_context.trace_id(),
"All spans must share trace_id"
);
3. Temporal Enclosure:
// Parent must strictly enclose child in time
assert!(parent.start_time <= child.start_time);
assert!(parent.end_time >= child.end_time);
4. Cross-Process Propagation:
// Extract traceparent from request
let traceparent = /* extract from _meta or header */;
// Parse W3C format
let parts: Vec<&str> = traceparent.split('-').collect();
assert_eq!(parts.len(), 4, "Invalid W3C format");
assert_eq!(parts[0], "00", "Invalid version");
assert_eq!(parts[1].len(), 32, "Invalid trace_id length");
assert_eq!(parts[2].len(), 16, "Invalid span_id length");
// Verify span context is valid (not all zeros)
assert_ne!(parts[2], "0000000000000000", "Span ID must be valid");
Testing Infrastructure Patterns (Abstract)
For testing distributed tracing, you need:
1. Trace Collector:
- Receives OTLP (OpenTelemetry Protocol) exports
- Stores spans in memory for inspection
- Runs on ephemeral port for test isolation
2. Fake Servers:
- MCP server that captures `_meta.traceparent`
- LLM provider that captures the `traceparent` header
- Both run on ephemeral ports
3. Validation:
- Extract and parse traceparent values
- Collect all exported spans
- Verify hierarchy, IDs, and timing
4. Test Flow:
1. Start OTLP collector
2. Start fake MCP/LLM servers
3. Configure goose to use fake servers
4. Execute operation that should create spans
5. Wait for span export (async)
6. Collect and verify spans
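A sketch of the verification in step 6 against a request body captured by the fake MCP server, assuming the body was stored as a serde_json::Value and reusing parse_traceparent from the receiver sketch above:

```rust
fn assert_request_carries_trace_context(captured_body: &serde_json::Value, expected_trace_id: u128) {
    // The fake MCP server stores each JSON-RPC request body; the trace
    // context must appear under _meta.traceparent (see "MCP Propagation").
    let traceparent = captured_body["_meta"]["traceparent"]
        .as_str()
        .expect("request should carry _meta.traceparent");

    let parsed = parse_traceparent(traceparent)
        .expect("traceparent should be valid W3C format with non-zero ids");

    // The propagated trace_id must match the trace the client exported,
    // proving the outbound request belongs to the same distributed trace.
    assert_eq!(parsed.trace_id, expected_trace_id);
}
```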
Debugging and Troubleshooting
Common Issues
1. Spans not appearing in traces:
- Check that tracing is initialized before span creation
- Verify OTLP exporter is configured correctly
- Ensure spans are being exported (may have delay)
- Check that sampling is enabled
2. Broken span hierarchy (orphaned spans):
- Verify span is entered before child operations
- Check that lazy streams capture and enter span
- Ensure spawned tasks use `.in_current_span()` or `.instrument()`
- Look for missing instrumentation on async functions
3. Missing trace context in remote calls:
- Verify the span context's `is_valid()` returns true
- Ensure span is active when request is built
- Debug by logging traceparent before sending
4. Span guard lifetime issues:
- Use `let _guard = span.enter()`, not `let _ = span.enter()`
- The underscore prefix keeps the guard alive
- The guard must live for the entire scope where the span should be active
Verification Techniques
1. Check active span:
let span = tracing::Span::current();
eprintln!("Current span: {:?}", span.metadata().map(|m| m.name()));
2. Verify span context validity:
use opentelemetry::trace::TraceContextExt;
use tracing_opentelemetry::OpenTelemetrySpanExt;
let context = tracing::Span::current().context();
let span = context.span();
let span_context = span.span_context();
eprintln!("Span context valid: {}", span_context.is_valid());
eprintln!("Trace ID: {:032x}", span_context.trace_id());
eprintln!("Span ID: {:016x}", span_context.span_id());
3. Inspect propagated traceparent:
// For HTTP (in request builder)
eprintln!("Injecting traceparent: {}", traceparent_header);
// For MCP (in extension injection)
eprintln!("Meta contains traceparent: {}", meta.get("traceparent").is_some());
Summary
Pattern Selection Guide
| Context | Pattern | Example |
|---|---|---|
| Regular async function | `#[instrument]` | `#[instrument] async fn foo()` |
| Need custom OTel attributes | Manual span creation | `create_chat_span()` + `let _guard = span.enter()` |
| Spawned task (current span) | `.in_current_span()` | `tokio::spawn(work().in_current_span())` |
| Spawned task (specific span) | `.instrument(span)` | `tokio::spawn(work().instrument(reply_span))` |
| Lazy stream | Capture + enter | `let s = Span::current(); try_stream! { let _g = s.enter(); }` |
When to Use Manual Span Creation
Use manual span creation instead of #[instrument] when:
- Need the `otel.name` attribute (for custom OTel span names)
- Following OpenTelemetry semantic conventions (GenAI, HTTP, etc.)
- Want different span name than function name
- Need to dynamically construct span attributes
- Require precise control over span lifecycle
Key Principles
1. Thread-local storage does NOT propagate across `.await`
- OpenTelemetry span context is thread-local
- Async tasks may resume on different threads
- Must use instrumentation patterns to propagate
2. Task-local storage DOES propagate across `.await`
- `tokio::task_local!` is task-scoped, not thread-scoped
- Useful for session IDs and other request-scoped data
3. `#[instrument]` uses the `Instrument` trait
- Automatically propagates the span across `.await` points
- Best for standard async functions
- Limited attribute support (no `otel.name`)
4. Lazy streams need the capture + enter pattern
- Stream created != stream executing
- Must capture the span before creation
- Enter the span when the stream executes
- Critical: use `let _guard = span.enter()`, not `let _ = span.enter()`
5. Span guards must stay alive
- Underscore-prefixed names (`_guard`, `_span_guard`) suppress warnings
- But keep the guard alive until drop
- Guard drop = span exit
6. Cross-process propagation uses W3C Trace Context
- HTTP: `traceparent` header
- MCP: `_meta.traceparent` field
- Same W3C format for both
- Automatic injection during request building
7. Distributed hierarchy is maintained via IDs
- `trace_id` is the same across all processes
- `span_id` is unique per span
- `parent_span_id` links to the parent's `span_id`
- Works across network boundaries
8. Validation prevents invalid propagation
- Span context must have non-zero `trace_id` and `span_id`
- Invalid contexts are silently skipped
- Remote processes start new traces if no valid context is received
9. OpenTelemetry semantic conventions matter
- GenAI spans require specific attributes
- Use manual span creation for conventions compliance
- Proper naming helps in trace visualization tools