opentelemetry-specification
opentelemetry-specification copied to clipboard
[Prometheus] Make instrumentation scope as identifying
I think this needs updates to some of the compatibility docs (at least prometheus). Currently, Prometheus adds scope name and scope version as labels to ensure we don't end up with duplicate timeseries, and we are able to put scope attributes in a separate otel_scope_info metric. With this change, we will need to update this to instead add the scope name, scope version, and scope attributes to all metrics.
TIL schema_url is also identifying, so that is something we will need to handle in Prometheus export and other "non-OTLP" compatibility documents.
Originally posted by @dashpole in https://github.com/open-telemetry/opentelemetry-specification/issues/4161#issuecomment-2361470786
@dashpole, are you able to take this?
cc @open-telemetry/wg-prometheus
Yes, I'll take this
The idea is to add otel_scope_[attribute] and otel_scheme_url labels to metrics.
I will do my best to help with this issue.
A prototype in Go: https://github.com/open-telemetry/opentelemetry-go/pull/5947
Based on the prototype, I think this could be split into 4 sub-issues:
- Handle scheme URL as identifying
- Handle scope attributes as identifying
- ~~Define a temporary env var (e.g. for 3 months) to control
without_scope_infoexporter configuration default value (initiallyfalse, latertrue)~~ https://github.com/open-telemetry/opentelemetry-specification/issues/4223#issuecomment-2875819391 - Remove "without_scope_info" configuration (and remove scope_info)
This seems like the right path going forward. Now that these are all identifying, we CANNOT rely on joins for scope.
:EDIT: or if we did, we'd need some kind of "unique id" to perform a join based on all identifying attributes.
@open-telemetry/prometheus-interoperability, @open-telemetry/collector-maintainers: PTAL
On the Collector side, I think there are no issues as long as the scope attribute set is 1. exposed to the user, and 2. identifying on the Prometheus side, to avoid errors with time series collisions. Adding the attributes as otel_scope_[attribute] labels on the metric points themselves as proposed above sounds like the simplest way to accomplish this.
(Unrelated, but what is the distinction between sub-issues 1 and 2 above?)
(Unrelated, but what is the distinction between sub-issues 1 and 2 above?)
@jade-guiton-dd
I have forgotten to delete "and scope attributes"` from sub-issue 1 😅 Fixed 😉
I will do my best to move this issue forward. However, I will start in about 2 weeks.
In the meantime you can try prototype it on the collector side and to test everything e2e together with https://github.com/open-telemetry/opentelemetry-go/pull/5947 to make sure that there are not issues with this proposal. Having such e2e prototypes (Collector + Go) will help move the specification work forward.
I made a corresponding Prometheus issue.
@pellared By "prototyping on the Collector side", did you mean in the Prometheus exporter in collector-contrib? Based on the discussion in today's Collector SIG meeting, it sounds like this part in CONTRIBUTING.md would make this not "count" as a prototype for the purpose of changing the spec:
For this reason, when adding new features with Development maturity level, a prototype is defined as a working demonstration of the feature in a language implementation which has the support of that language's maintainers.
In other words, we may need a prototype in another language SDK's Prometheus exporter
Filed #4497 to reword the prototype requirement
@jade-guiton-dd, I think that Prometheus receiver in collector-contrib is also crucial (not only exporter).
Yes of course, everything performing translation between OTel and Prometheus will need to be updated eventually. I was just wondering about what prototypes you think need to be done to get the spec change moving.
Re-reading your message, I think I misunderstood. By "e2e prototypes (Collector + Go)", did you mean a single prototype where the draft PR modifying the SDK exporter is integrated into the Collector to test that it fixes our issue when producing scope attributes?
did you mean a single prototype where the draft PR modifying the SDK exporter is integrated into the Collector to test that it fixes our issue when producing scope attributes?
No. I just mean building 3 prototypes: Go SDK Prometheus exporter, Collector Prometheus receiver, Collector Prometheus exporter and check if they will work together correctly (even "manually"). I just think it would be nice to check all components starting from an instrumented application, through Collector, finishing in Prometheus. I am fine with any other validation/testing approach.
I updated the OTel Go's exporter implementation here: https://github.com/open-telemetry/opentelemetry-go/pull/5947. PTAL
PoC for Prometheus exporter in collector-contrib: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40004
Reviews appreciated, especially in test coverage of edge cases :)
@ArthurSens, are you going to also work on prototype for prometheusreceiver?
Given https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40004#issue-3057707170:
Turns out our exporter never exposed otel_scope_info, so this PR just introduces the addition of the new labels.
and https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/3c6a5ea6430afaac7c33a25ff276903ef829693b/receiver/prometheusreceiver#resource-and-scope:
It drops the
otel_scope_infometric, and uses attributes (other thanotel_scope_nameandotel_scope_version) to populate Scope Attributes.
It looks like we can drop point 3 from https://github.com/open-telemetry/opentelemetry-specification/issues/4223#issuecomment-2839595942 completely (otel_scope_info metric is useless)
@ArthurSens, are you going to also work on prototype for prometheusreceiver?
I spent some time yesterday and it was not as easy as I expected, but I'm also not 100% familiar with that codebase 😬
I plan to spend more time on it this week, but it might take a while to have a PoC
I created a specification PR. PTAL:
- https://github.com/open-telemetry/opentelemetry-specification/pull/4505
PoCs in collector are ready for review:
- https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40060
- https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40004