
Request for adoption in OTel SDKs of the W3C Trace Context Level 2 spec (which is now in Candidate Recommendation status)

Open kalyanaj opened this issue 2 years ago • 10 comments

What are you trying to achieve? As part of the W3C DT group, today we published the Level 2 (aka version 2) of the W3C Trace Context spec in "Candidate Recommendation" (CR) status: https://www.w3.org/TR/trace-context-2/.

Please see the above link for full details. To summarize, one main change from the Level 1 version (which OpenTelemetry has already adopted) is that the Level 2 spec adds a new trace flag to the traceparent header, called the "random trace id" flag. When this flag is set, it conveys that at least the right-most 7 bytes of the trace ID were generated in a random (or pseudo-random) manner. This helps samplers, sharding logic, and similar components, because it gives them a stronger guarantee that the trace ID is random or pseudo-random.
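For illustration, here is a minimal sketch of how a receiver might check the flag. It assumes the Level 2 layout, in which trace-flags is the fourth dash-separated field of traceparent and the random trace id flag is the second least significant bit (0x02), next to the existing sampled bit (0x01). The function names are hypothetical and not taken from any OTel SDK.

```python
# Bit positions in the 8-bit trace-flags field of traceparent.
SAMPLED_FLAG = 0x01          # defined in Level 1
RANDOM_TRACE_ID_FLAG = 0x02  # added in Level 2


def parse_trace_flags(traceparent: str) -> int:
    """Return the trace-flags byte of a traceparent header as an int."""
    parts = traceparent.strip().split("-")
    # Expected layout: version-traceid-parentid-traceflags
    if len(parts) < 4 or len(parts[3]) != 2:
        raise ValueError("malformed traceparent header")
    return int(parts[3], 16)


def has_random_trace_id(traceparent: str) -> bool:
    """True if the sender marked the trace ID as randomly generated."""
    return bool(parse_trace_flags(traceparent) & RANDOM_TRACE_ID_FLAG)
```

A sampler could call `has_random_trace_id(header)` before deciding whether it can rely on the low-order bytes of the trace ID as a source of randomness.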

What did you expect to see? Based on the above, I want to update the OpenTelemetry community on this spec so that you can evaluate it and consider prototyping/adopting it in OTel SDKs. Assuming the trace ID is already being generated in a pseudo-random manner, this would involve setting the new trace flag to reflect that.

Note: The exit criterion for this standard to move from the current "Candidate Recommendation" status to "Recommendation" status is to have at least two implementations.

Additional context. https://www.w3.org/TR/trace-context-2/ has the details.

kalyanaj avatar Apr 18 '23 19:04 kalyanaj

The concrete changes here that are relevant for OTel:

  1. if trace id is randomly generated (most if not all SDKs) set the random flag
  2. propagate the random flag if a traceparent is received
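The two changes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any OTel SDK implements it: the helper names are hypothetical, the random flag is assumed to be bit 0x02 of trace-flags, and the child keeps the incoming trace ID unchanged, which is why the incoming random flag is carried over as-is.

```python
import secrets

SAMPLED_FLAG = 0x01
RANDOM_TRACE_ID_FLAG = 0x02  # Level 2 "random trace id" flag


def _nonzero_hex(num_bytes: int) -> str:
    """Random lowercase hex string; all-zero IDs are invalid, so retry on those."""
    while True:
        value = secrets.token_hex(num_bytes)
        if int(value, 16) != 0:
            return value


def new_root_traceparent(sampled: bool) -> str:
    """Change 1: the trace ID is fully random, so set the random flag."""
    flags = RANDOM_TRACE_ID_FLAG | (SAMPLED_FLAG if sampled else 0)
    return f"00-{_nonzero_hex(16)}-{_nonzero_hex(8)}-{flags:02x}"


def child_traceparent(incoming: str, sampled: bool) -> str:
    """Change 2: keep the incoming trace ID and carry its random flag over."""
    _version, trace_id, _parent_id, flags_hex = incoming.strip().split("-")[:4]
    # Only the random flag is copied; the sampled bit reflects the local decision.
    flags = int(flags_hex, 16) & RANDOM_TRACE_ID_FLAG
    if sampled:
        flags |= SAMPLED_FLAG
    return f"00-{trace_id}-{_nonzero_hex(8)}-{flags:02x}"
```

Note that `child_traceparent` never sets the random flag on its own: if the upstream service did not set it, the child has no basis for claiming the trace ID is random.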

dyladan avatar Apr 25 '23 12:04 dyladan

  2. propagate the random flag if a traceparent is received

~Isn't propagation of unknown flags already required by the previous W3C spec? Would still be good to verify I guess.~

Late EDIT: This is wrong. See comment below https://github.com/open-telemetry/opentelemetry-specification/issues/3411#issuecomment-1525548085

Oberon00 avatar Apr 25 '23 13:04 Oberon00

if trace id is randomly generated (most if not all SDKs) set the random flag

I wonder if we should strongly suggest that all SIGs generate random IDs (if there is still any SIG that doesn't).

carlosalberto avatar Apr 26 '23 13:04 carlosalberto

2. propagate the random flag if a traceparent is received

Isn't propagation of unknown flags already required by the previous W3C spec? Would still be good to verify I guess.

no. unknown flags SHOULD be set to 0

dyladan avatar Apr 27 '23 11:04 dyladan

I believe we should add flags to the protocol for Span context and links. Otherwise, a tail sampler and downstream consumer will not be able to know whether the TraceID has the random flag.

Same issue is written at least once: https://github.com/open-telemetry/opentelemetry-proto/issues/382 (I'm pretty sure there's another copy of this written somewhere else).

jmacd avatar Aug 04 '23 17:08 jmacd

Now that the following is done:

  • https://github.com/open-telemetry/opentelemetry-proto/pull/503
  • https://github.com/open-telemetry/opentelemetry-proto/pull/523

can this item, W3C Trace Context Level 2, be adopted in the specification?

Looking to implement it for opentelemetry-cpp.

marcalff avatar Jan 12 '24 22:01 marcalff

~~@marcalff that can only be decided by OpenTelemetry which is a different entity. I suggest you add your voice to https://github.com/open-telemetry/opentelemetry-specification/issues/3411~~

edit: I was going through my notifications and thought this was a question on the w3c GitHub. Ignore this message

dyladan avatar Jan 30 '24 18:01 dyladan

The Trace Context level 2 is in candidate recommendation phase and the W3C is actively looking for adopters. It would be very helpful to us (w3c working group) if OpenTelemetry (including opentelemetry-cpp) can implement this at least experimentally to ensure it works as intended.

dyladan avatar Jan 30 '24 18:01 dyladan

The Trace Context level 2 is in candidate recommendation phase and the W3C is actively looking for adopters. It would be very helpful to us (w3c working group) if OpenTelemetry (including opentelemetry-cpp) can implement this at least experimentally to ensure it works as intended.

There is a chicken-and-egg problem with the process, then.

Preparation work has been done already:

https://github.com/open-telemetry/opentelemetry-cpp/blob/497eaf43e5676ae7982f7119a2ac70d09211a6f5/sdk/src/trace/tracer.cc#L83-L93

I am waiting for opentelemetry-specification to change and state that supporting Level 2 is OK (perhaps as experimental), so that I can uncomment that code and actually support Level 2 in the implementation.

marcalff avatar Jan 31 '24 09:01 marcalff

Currently, implementations use either the right-most 63 or 64 bits for sampling. Should we all adopt 56 bits, but only once this is adopted?
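To make the 56-bit idea concrete, here is a minimal sketch of a ratio sampler that uses only the right-most 7 bytes (14 hex characters) of the trace ID, i.e. the bytes the random flag makes a guarantee about. The function names and the "value below threshold means sampled" convention are assumptions for illustration; this is not the algorithm specified by OpenTelemetry.

```python
def randomness_56(trace_id_hex: str) -> int:
    """Right-most 7 bytes (56 bits) of a 32-hex-character trace ID, as an int."""
    if len(trace_id_hex) != 32:
        raise ValueError("trace ID must be 32 hex characters")
    return int(trace_id_hex[-14:], 16)


def should_sample(trace_id_hex: str, ratio: float) -> bool:
    """Sample when the 56-bit value falls below ratio * 2**56 (assumed convention)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    threshold = int(ratio * (1 << 56))
    return randomness_56(trace_id_hex) < threshold
```

Because every participant looks at the same 56 bits, services configured with the same ratio reach the same decision for a given trace, and the upper 9 bytes of the trace ID never influence it.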

tsloughter avatar Aug 01 '24 17:08 tsloughter