opentelemetry-go
opentelemetry-go copied to clipboard
Baggage API is not working, entangled with W3C Propagator
Description
As a user of Baggage, whether it's via OTEL Baggage API or OpenTracing Bridge, the value of a baggage entry is not supposed to be restricted, i.e. the following should be valid code:
// via OpenTracing Bridge
span.SetBaggageItem("someKey", "any value I want /#%")
// via baggage package
mem, err := baggage.NewMember("someKey", "any value I want /#%")
require.NoError(t, err)
bag.SetMember(mem)
However, the NewMember function invokes url.QueryUnescape(value) and fails if that returns an error. And even if I URL-escape the value first, when baggage is used via OpenTracing Bridge, at some point the same NewMember function is called with unescaped value and fails again.
I think this is the wrong behavior. There may be reasons to store the value internally as url-encoded so that serialization is more efficient (since baggage is generally accessed less frequently than it is passed around and serialized). However, that is the internal implementation detail, which is between Baggage and Propagator, it should not be spilling out to the API that the user is exposed to. A user should only deal with unencoded strings that makes sense in the application.
Unit test tat illustrates the issue: #3689
This seems similar to the issue #2523.
cc @williamgcampbell @NewReStarter
cc @MadVikingGod
The change was introduced in #3226. It did not provide a good justification why it needs to be this way. Referencing W3C doesn't make sense since it does not dictate in-memory representation, only wire format. From an API design of baggage, it makes more sense to allow the users to use any strings they want, and just take care of encoding them on the wire.
If people disagree (I can see a performance argument for encoding), then the API must be clearly documented to require URL-encoded values. Then the OpenTracing Bridge could be fixed to encode them when passing back to OTEL Baggage.
Looking more into this, I would certainly advocate for reverting #3226. It introduced weird asymmetry in the baggage.Member type where its constructor requires URL-encoded value, but member.Value() function returns raw value (because it's un-encoded in the constructor). So the performance argument I mentioned above doesn't even apply.
Another issue: in #2529 a decoding was added to parseMember, right before the value is checked against regexp to validate that it is "safe" per W3C. After #2529 it's the decoded value that's checked, which is incorrect, and valid strings now fail (the unit test that was added kind-of cheated by making decoded string look like URL-encoded)
@MrAlias can you shed some light on the original motivation?
From my point of view, this would be the ideal implementation:
- Users of the API are not exposed to URL-encoding, meaning any string is valid to pass to
NewMember, and the same string must be returned fromMember.Value()- Motivation: baggage API is a general-purpose context propagation mechanism. It is supposed to be completely de-coupled from the W3C spec. The latter is only concerned with wire representation, and there are other possible wire representations, such as binary encoding proposed in the original paper from Jonathan Mace that introduced the term "baggage".
- The internal storage of key/values in Member struct should also be raw, unencoded strings.
- There is a possible optimization where the baggage propagators pass an accessor function to Member instead of the actual decoded values, so that if the value is never read in a service, it never needs to be decoded.
- All URL encoding/decoding should move to the baggage propagator code, including the regex checks.
- I am not 100% those checks are even needed since the encoding/decoding functions from stdlib presumably already ensure that the encoded strings satisfy the desired character set.
Unfortunately, switching to this approach at this time is a breaking change. Fortunately, it's going to be breaking for already broken state, which was created after #3226 / #2529.
https://github.com/open-telemetry/opentelemetry-go/pull/4804 was applied to fix OpenTracing Bridge and add functions that are not entangled with W3C Propagator.
I am not closing the issue as we can still consider deprecating all the functions and methods that are entangled with W3C Propagator. Reference: https://github.com/open-telemetry/opentelemetry-go/pull/4804#issuecomment-1876601271