opentelemetry-go-contrib

B3 context propagator missing parentspanid inject

Open albertteoh opened this issue 4 years ago • 9 comments
Background

I want to add tracing instrumentation to my Istio services with otel-go.

All this requires is copying the inbound B3 HTTP request headers to the outbound HTTP request headers; Istio takes care of the rest, i.e. it propagates the B3 trace context, as described here: https://istio.io/latest/docs/tasks/observability/distributed-tracing/overview/

The approach I've taken is to extract the context from the HTTP headers:

propagator := otel.GetTextMapPropagator()

// Note: The span context is stored under the "remoteContextKey"
ctx := propagator.Extract(r.Context(), propagation.HeaderCarrier(r.Header))

I then pass this context into my outbound call and use the propagator's Inject method:

propagator := otel.GetTextMapPropagator()
provider := sdktrace.NewTracerProvider()
tracer := provider.Tracer("zipkin-propagator")

// This is necessary to copy the extracted span context (above) from the "remoteContextKey" into the "currentSpanKey"
// because propagator.Inject looks up the span context under this "currentSpanKey".
ctx, span := tracer.Start(ctx, "something")
defer span.End()

propagator.Inject(ctx, propagation.HeaderCarrier(request.Header))

Problem

The outbound child span isn't nested under the inbound request parent span: [screenshot: Screen Shot 2021-04-09 at 10 16 37 pm]

What I expect to see is something like: [screenshot: Screen Shot 2021-04-09 at 10 45 36 pm]

Cause

  • propagator.Inject doesn't set the "x-b3-parentspanid" header, resulting in the loss of the causal relationship between spans.
  • This may be because SpanContext doesn't have a reference to the parent span ID.

Questions

  • Is the missing parent span ID in SpanContext by design? I noticed there's the concept of Links though I don't understand how to apply it to resolve this issue.
  • Any ideas how I can resolve this problem?

albertteoh avatar Apr 09 '21 13:04 albertteoh

The OpenTelemetry SpanContext does not have a way to store parent span information. This will need to be addressed at the specification level before we can resolve this.

MrAlias avatar Apr 09 '21 15:04 MrAlias

The B3 propagator intentionally does not inject the x-b3-parentspanid field as the specification forbids it.

In your test setup, have you configured an exporter from the application that is using the B3 propagator? It looks like what might be happening is that you are starting a span in your application, making that span the parent for the span generated by Istio's proxy outbound from the service. If the span from within the application isn't exported then the tree will be broken in the way seen here.
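The broken-tree scenario described above can be illustrated with a small sketch. It does not use the OpenTelemetry API: the `span` type and the `orphans` helper are hypothetical names introduced here, and the span IDs are made up. It only shows why a trace viewer cannot nest the outbound proxy span when the application span it points to was never exported.

```go
package main

import "fmt"

// span is a hypothetical, simplified stand-in for an exported span.
type span struct {
	ID       string
	ParentID string // empty for a root span
	Name     string
}

// orphans returns the spans whose parent ID does not match any span in the
// exported set, i.e. spans a trace viewer cannot attach to their parent.
func orphans(exported []span) []span {
	ids := make(map[string]bool, len(exported))
	for _, s := range exported {
		ids[s.ID] = true
	}
	var out []span
	for _, s := range exported {
		if s.ParentID != "" && !ids[s.ParentID] {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	// The application span "b" sits between the two proxy spans but is
	// never exported, so it is absent from this list.
	exported := []span{
		{ID: "a", Name: "proxy inbound"},
		{ID: "c", ParentID: "b", Name: "proxy outbound"},
	}
	for _, s := range orphans(exported) {
		fmt.Printf("%s (%s) refers to missing parent %s\n", s.Name, s.ID, s.ParentID)
	}
}
```

If span "b" were exported with parent "a", the helper would return nothing and the three spans would form a single chain.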

Changes to how remote SpanContexts are handled in the API, which haven't been released yet, may let you skip creating your own tracer and starting and exporting a span if you merely want to propagate the context.

Aneurysm9 avatar Apr 09 '21 15:04 Aneurysm9

Thank you very much @MrAlias @Aneurysm9!

I did try upgrading otel-go to pick up those recent changes, which sound promising as indeed, all I want to achieve is to simply propagate the context unmodified. However, there were some compilation errors relating to trace.FlagsDebug and trace.FlagsDeferred missing in the B3 propagators.

I'll park this effort for now and fall back to using Jaeger propagators or manually copying HTTP headers to propagate context.
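For reference, the manual header-copying fallback mentioned above might look roughly like the sketch below. It uses only the Go standard library. The `forwardedHeaders` list and the `copyTraceHeaders` helper are names made up for this illustration, and the header list is an assumption based on the B3 header names in this thread; the Istio document linked in the issue description is the authority on which headers to forward.

```go
package main

import (
	"fmt"
	"net/http"
)

// forwardedHeaders is an assumed list of tracing headers to pass through.
// Check the Istio documentation for the list that applies to your setup.
var forwardedHeaders = []string{
	"x-request-id",
	"x-b3-traceid",
	"x-b3-spanid",
	"x-b3-parentspanid",
	"x-b3-sampled",
	"x-b3-flags",
	"b3",
}

// copyTraceHeaders copies the listed headers from src to dst unchanged and
// leaves every other header alone.
func copyTraceHeaders(dst, src http.Header) {
	for _, name := range forwardedHeaders {
		if v := src.Get(name); v != "" {
			dst.Set(name, v)
		}
	}
}

func main() {
	// Stand-ins for the inbound request's headers and the outbound request's headers.
	inbound := http.Header{}
	inbound.Set("x-b3-traceid", "fc4ffd7e645251b7c06de0b3d6908565")
	inbound.Set("x-b3-spanid", "c06de0b3d6908565")
	inbound.Set("x-b3-sampled", "1")
	inbound.Set("cookie", "not-forwarded")

	outbound := http.Header{}
	copyTraceHeaders(outbound, inbound)
	fmt.Println(outbound)
}
```

In a handler, `src` would be `r.Header` of the inbound request and `dst` the `Header` of the outbound `*http.Request`. No span is created, so the trace context reaches the sidecar unmodified.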

albertteoh avatar Apr 20 '21 09:04 albertteoh

I did try upgrading otel-go to pick up those recent changes, which sound promising as indeed, all I want to achieve is to simply propagate the context unmodified. However, there were some compilation errors relating to trace.FlagsDebug and trace.FlagsDeferred missing in the B3 propagators.

The B3 propagator has been updated in the latest release to work with the new SpanContext that removed these flags. If you are still able to validate the proposed solution, please take another look.

MrAlias avatar May 04 '21 15:05 MrAlias

Hi, I'm looking into this from the Consul side of things. I see the same problem with the spans not being nested properly. I'm wondering how Zipkin could possibly work without the x-b3-parentspanid headers being propagated?

lkysow avatar Dec 10 '21 20:12 lkysow

For Istio, I got it working using the B3 multi-header propagator: https://github.com/fortio/otel-sample-app

ldemailly avatar Jan 02 '23 23:01 ldemailly

@Aneurysm9

The B3 propagator intentionally does not inject the x-b3-parentspanid field as the specification forbids it.

The parent span is not propagated. That is fine. But the client span is to be used as the parent span later on. Let's say I want to write a proxy server that forwards an incoming HTTP request.

  • The client sends B3: <traceId>-<clientSpanId>-1-<parentSpanId>, where parentSpanId may be missing if the client is the root.
  • My proxy receives that and is expected to send B3: <traceId>-<newSpanId>-1-<clientSpanId>, where the new parent span ID is the client's original span ID.

How do I make that happen? It seems to me that the propagator does not add the parent span. Therefore message order/relations are ambiguous.
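The header rewrite described in the two bullets above can be written out by hand as a rough sketch. This is not the OpenTelemetry B3 propagator, which, as quoted above, does not inject the parent span ID. The `b3Context` type and its methods are hypothetical names, the IDs are sample values, and the new span ID is passed in by the caller rather than generated. The sketch also skips validation of ID lengths and rejects the sampling-only forms of the header.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// b3Context holds the fields of a single-header B3 value:
// <traceId>-<spanId>-<sampled>-<parentSpanId>, where the last two are optional.
type b3Context struct {
	TraceID      string
	SpanID       string
	Sampled      string
	ParentSpanID string
}

// parseB3Single splits a single-header B3 value into its fields.
func parseB3Single(v string) (b3Context, error) {
	parts := strings.Split(v, "-")
	if len(parts) < 2 || len(parts) > 4 {
		return b3Context{}, errors.New("unsupported b3 header value")
	}
	c := b3Context{TraceID: parts[0], SpanID: parts[1]}
	if len(parts) >= 3 {
		c.Sampled = parts[2]
	}
	if len(parts) == 4 {
		c.ParentSpanID = parts[3]
	}
	return c, nil
}

// child keeps the trace ID and sampling state, uses newSpanID as the span ID,
// and records the receiver's span ID as the parent.
func (c b3Context) child(newSpanID string) b3Context {
	return b3Context{
		TraceID:      c.TraceID,
		SpanID:       newSpanID,
		Sampled:      c.Sampled,
		ParentSpanID: c.SpanID,
	}
}

// String encodes the context back into single-header form.
func (c b3Context) String() string {
	parts := []string{c.TraceID, c.SpanID}
	if c.Sampled != "" {
		parts = append(parts, c.Sampled)
		// The parent span ID occupies the fourth position, so this sketch
		// only writes it when a sampling state is present.
		if c.ParentSpanID != "" {
			parts = append(parts, c.ParentSpanID)
		}
	}
	return strings.Join(parts, "-")
}

func main() {
	incoming := "80f198ee56343ba864fe8b2a57d3eff7-e457b5a2e4d86bd1-1-05e3ac9a4f6e3b90"
	parsed, err := parseB3Single(incoming)
	if err != nil {
		panic(err)
	}
	// "1111111111111111" stands in for a freshly generated span ID.
	fmt.Println(parsed.child("1111111111111111").String())
}
```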

Som-Som-CC avatar Feb 24 '23 20:02 Som-Som-CC

Hi @albertteoh did you manage to fix that?

I'm facing a similar issue. For some reason, when I enable tracing in Istio to be propagated to OTel, the B3 propagation headers disappear after the trace export and no longer show up in the logs.

x-envoy-internal: true
x-request-id: ec219ebe-5bfd-918f-a8aa-ff14b5843aa7
traceparent: 00-8910de62a422f5c3035b2c085416c4aa-d163a03858a88f82-01
tracestate: 

kubectl logs -l app=person-service -f -c istio-proxy
[time] "GET /api/people HTTP/1.1" 401 - ... inbound|8082|| 127.0.0.6:56425 10.42.2.74:8082 10.42.0.0:0 - default traceID=- spanID=- parentSpanID=-

If I use Istio's default Jaeger tracing, the headers are propagated fine.

x-envoy-internal: true
x-request-id: fea545eb-8b8a-9f67-b00a-7a3ad894f419
x-b3-traceid: fc4ffd7e645251b7c06de0b3d6908565
x-b3-spanid: c06de0b3d6908565
x-b3-sampled: 1

kubectl logs -l app=person-service -f -c istio-proxy
[datetime] "GET /api/people HTTP/1.1" 401 - ... 10.42.0.0:0 - default traceID=fc4ffd7e645251b7c06de0b3d6908565 spanID=c06de0b3d6908565 parentSpanID=-

rodrigorodrigues avatar Nov 02 '23 22:11 rodrigorodrigues

Hi @albertteoh did you manage to fix that?

Hi @rodrigorodrigues, sorry, I haven't looked into this for a long time and, from what I recall, I didn't get it working with the OTel SDK at the time; but I'm sure things have changed since then.

albertteoh avatar Nov 03 '23 10:11 albertteoh