opentelemetry-java
opentelemetry-java copied to clipboard
W3C Baggage Propagator should not percent-encode metadata
Context
The W3C working group for distributed tracing is aiming to move the W3C baggage specification from candidate recommendation stage to an actual recommendation. As part of this, we were reviewing existing implementations of the specification and check them for compliance.
Note: There is a difference between the W3C baggage spec and the OTel baggage spec in that the W3C spec supports a set of additional properties on baggage entries (the actual baggage key-value pairs). OTel calls this concept metadata but treats all metadata on a baggage entry as one opaque string.
Describe the bug
The opentelemetry-java implementation applies percent encoding to the metadata of baggage entries. This potentially breaks compatibility with other participants in the same distributed transaction. The W3C Baggage Propagator should not percent-encode metadata, but instead do no processing at all on it.
Here is a more in-depth explanation why that encoding is problematic and might break other implementations of the W3C baggage spec. Consider the following baggage header:
baggage: SomeKey=SomeValue;ValueProp \t = \t PropVal
An implementation that is compliant with the W3C spec would parse this into one baggage entry (with key SomeKey
, value SomeValue
) with one metadata property with key-value shape (key ValueProp
and value PropVal
).
If the Java OTel SDK receives this baggage header and then propagates it downstream, it will change the header content as follows (more precisely, this is the string after one extract
& inject
roundtrip):
baggage: SomeKey=SomeValue;ValueProp%20%09%20%3D%20%09%20PropVal"
(This is even tested for in this test case: https://github.com/open-telemetry/opentelemetry-java/blob/main/api/all/src/test/java/io/opentelemetry/api/baggage/propagation/W3CBaggagePropagatorTest.java#L442-L461)
That is, the inner whitespace and the equals sign in the property have been percent-encoded.
Now, when this header value is receive by an implementation that is compliant with the W3C baggage spec, it would be parsed as follows: One baggage entry (same as above) with one property which no longer has key-value shape but is now a flag-like property ValueProp%20%09%20%3D%20%09%20PropVal
. That is, the key-value metadata property has become a single string.
Please note that neither the W3C spec nor the OTel spec mandate to percent-encode properties/metadata. In fact the OTel spec explictly says
Metadata: Optional metadata associated with the name-value pair. This should be an opaque wrapper for a string with no semantic meaning. Left opaque to allow for future functionality.
I read this as "do not touch this string at all".
In its current form, the W3C spec mandates to percent-encode baggage entry values, but metadata properties are separate from that.
Steps to reproduce
The test mentioned above reproduces the behavior.
Additionally, I ported the W3C baggage spec test test suite from Python over to Java to verify the Java OTel SDK implementation. The result is here in case you are interested: https://github.com/basti1302/opentelemetry-java/blob/w3c-baggage-issue-138-audit-otel-java-sdk/api/all/src/test/java/io/opentelemetry/api/baggage/propagation/W3cSpecTest.java
What did you expect to see? Baggage metadata should not be percent-encoded.
What did you see instead?
Baggage metadata being percent-encoded after one extract
and inject
round trip.
What version and what artifacts are you using?
The current main
branch of this repo. (commit eb53fe3a614a5feb6ddd74d93b43722a0033fffe)
Additional context
In contrast to the behavior in Java, the JS OTel SDK does not process metadata at all.
See this test: https://github.com/open-telemetry/opentelemetry-js/blob/main/packages/opentelemetry-core/test/baggage/W3CBaggagePropagator.test.ts#L40-L62
@dyladan (who maintains opentelemetry-js and also participates in the W3C working group) commented that the decision for treating metadata as one opaque string within the OTel spec was deliberately designed because at that time it was not completely clear how baggage properties would evolve in the W3C spec. Unfortunately, percent-encoding defeats that purpose.
There is also a corresponding issue in the W3C baggage repository: https://github.com/w3c/baggage/issues/138