Prometheus translation with UTF-8 and `__`
- Multiple consecutive `_` characters SHOULD be replaced with a single `_` character.
- I have just added UTF-8 support to the Java client: https://github.com/open-telemetry/opentelemetry-java/pull/7588
- Now the Prometheus converter in the Java SDK no longer changes `%%` to `_`, because underscore escaping is done at scrape time, and at scrape time the OTel-to-Prometheus converter is not used
- Technically, this is a breaking change
- This should be the same issue in all SDKs (if they implemented the "multiple chars" part mentioned above)
- Should clients add a flag for UTF-8 support?
  - when disabled, the client would convert non-legacy chars to `_` and could thus also replace `__` with `_`
Was last discussed in https://github.com/open-telemetry/opentelemetry-specification/pull/4533
Well, the spec says SHOULD, not MUST. So technically, an SDK is still compliant even if multiple underscores are kept between words. However, for a SHOULD, folks usually explain why they are intentionally not complying. If Java decides not to comply, it would be nice to have some docs with the explanation.
If you want to implement this, maybe our Go implementation can inspire you? We do a string split to later join with a single _ character
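
A minimal sketch of the split-and-join idea, for illustration only. This is not the actual Go SDK code, and `collapseUnderscores` is a hypothetical name. It assumes the name has already been reduced to legacy characters, so only `_` needs handling:

```go
package main

import "strings"

// collapseUnderscores is a hypothetical helper (not an OTel or Prometheus API).
// It splits the name on '_' and rejoins the non-empty parts with a single '_'.
// strings.FieldsFunc drops empty fields, so a run of underscores becomes one
// separator, and leading or trailing underscores are dropped entirely.
func collapseUnderscores(name string) string {
	parts := strings.FieldsFunc(name, func(r rune) bool { return r == '_' })
	return strings.Join(parts, "_")
}
```

With this sketch, `collapseUnderscores("http__server___duration")` yields `http_server_duration`. Note the side effect that a leading or trailing `_` is removed too, which may or may not be wanted.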
This doesn't work for Java and I'm not sure why it does for Go:
- we pass the chars to the Prometheus client as Unicode, so there's no escaping at that point
- at scrape time, escaping can occur, but this is done in the client, so after the OTel-to-Prometheus conversion
For Go, we do the escaping before registering the metric, so it's not during scrape time.
Takeaway from meeting:
Users usually want to control translation on the ingestion side rather than at scrape time, so it's good to keep a "utf8 support" setting in OTel SDKs.
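
A rough sketch of what such a setting could look like on the SDK side. Everything here is an assumption for illustration: `translateName` and its `utf8Allowed` parameter are hypothetical, not an existing SDK option, and "legacy" is taken to mean the classic Prometheus metric-name characters (ASCII letters, digits, `_`, `:`). Handling of a leading digit is left out.

```go
package main

import "strings"

// isLegacyChar reports whether r is allowed in a legacy (pre-UTF-8)
// Prometheus metric name: ASCII letters, digits, '_' and ':'.
func isLegacyChar(r rune) bool {
	return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') ||
		(r >= '0' && r <= '9') || r == '_' || r == ':'
}

// translateName is a hypothetical helper showing the two modes discussed.
// With utf8Allowed set, the name is passed through unchanged and any escaping
// is left to scrape time. Otherwise every non-legacy character becomes '_'
// and runs of '_' are collapsed into one at translation time.
func translateName(name string, utf8Allowed bool) string {
	if utf8Allowed {
		return name
	}
	var b strings.Builder
	prevUnderscore := false
	for _, r := range name {
		if !isLegacyChar(r) {
			r = '_'
		}
		if r == '_' {
			if prevUnderscore {
				continue // skip repeated underscores
			}
			prevUnderscore = true
		} else {
			prevUnderscore = false
		}
		b.WriteRune(r)
	}
	return b.String()
}
```

For example, `translateName("http.%%server", false)` gives `http_server`, while `translateName("http.%%server", true)` returns the name untouched.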