opentelemetry-specification icon indicating copy to clipboard operation
opentelemetry-specification copied to clipboard

[Prometheus] Make instrumentation scope as identifying

Open pellared opened this issue 1 year ago • 2 comments
trafficstars

I think this needs updates to some of the compatibility docs (at least prometheus). Currently, Prometheus adds scope name and scope version as labels to ensure we don't end up with duplicate timeseries, and we are able to put scope attributes in a separate otel_scope_info metric. With this change, we will need to update this to instead add the scope name, scope version, and scope attributes to all metrics.

TIL schema_url is also identifying, so that is something we will need to handle in Prometheus export and other "non-OTLP" compatibility documents.

Originally posted by @dashpole in https://github.com/open-telemetry/opentelemetry-specification/issues/4161#issuecomment-2361470786

pellared avatar Sep 24 '24 15:09 pellared

@dashpole, are you able to take this?

pellared avatar Sep 24 '24 15:09 pellared

cc @open-telemetry/wg-prometheus

Yes, I'll take this

dashpole avatar Sep 24 '24 15:09 dashpole

The idea is to add otel_scope_[attribute] and otel_scheme_url labels to metrics.

I will do my best to help with this issue.

pellared avatar Oct 31 '24 20:10 pellared

A prototype in Go: https://github.com/open-telemetry/opentelemetry-go/pull/5947

pellared avatar Dec 17 '24 12:12 pellared

Based on the prototype, I think this could be split into 4 sub-issues:

  1. Handle scheme URL as identifying
  2. Handle scope attributes as identifying
  3. ~~Define a temporary env var (e.g. for 3 months) to control without_scope_info exporter configuration default value (initially false, later true)~~ https://github.com/open-telemetry/opentelemetry-specification/issues/4223#issuecomment-2875819391
  4. Remove "without_scope_info" configuration (and remove scope_info)

pellared avatar Apr 29 '25 16:04 pellared

This seems like the right path going forward. Now that these are all identifying, we CANNOT rely on joins for scope.

:EDIT: or if we did, we'd need some kind of "unique id" to perform a join based on all identifying attributes.

jsuereth avatar Apr 29 '25 18:04 jsuereth

@open-telemetry/prometheus-interoperability, @open-telemetry/collector-maintainers: PTAL

pellared avatar Apr 29 '25 19:04 pellared

On the Collector side, I think there are no issues as long as the scope attribute set is 1. exposed to the user, and 2. identifying on the Prometheus side, to avoid errors with time series collisions. Adding the attributes as otel_scope_[attribute] labels on the metric points themselves as proposed above sounds like the simplest way to accomplish this.

(Unrelated, but what is the distinction between sub-issues 1 and 2 above?)

jade-guiton-dd avatar Apr 30 '25 09:04 jade-guiton-dd

(Unrelated, but what is the distinction between sub-issues 1 and 2 above?)

@jade-guiton-dd

I have forgotten to delete "and scope attributes"` from sub-issue 1 😅 Fixed 😉

I will do my best to move this issue forward. However, I will start in about 2 weeks.

In the meantime you can try prototype it on the collector side and to test everything e2e together with https://github.com/open-telemetry/opentelemetry-go/pull/5947 to make sure that there are not issues with this proposal. Having such e2e prototypes (Collector + Go) will help move the specification work forward.

pellared avatar Apr 30 '25 11:04 pellared

I made a corresponding Prometheus issue.

aknuds1 avatar May 06 '25 07:05 aknuds1

@pellared By "prototyping on the Collector side", did you mean in the Prometheus exporter in collector-contrib? Based on the discussion in today's Collector SIG meeting, it sounds like this part in CONTRIBUTING.md would make this not "count" as a prototype for the purpose of changing the spec:

For this reason, when adding new features with Development maturity level, a prototype is defined as a working demonstration of the feature in a language implementation which has the support of that language's maintainers.

In other words, we may need a prototype in another language SDK's Prometheus exporter

jade-guiton-dd avatar May 07 '25 12:05 jade-guiton-dd

Filed #4497 to reword the prototype requirement

mx-psi avatar May 07 '25 12:05 mx-psi

@jade-guiton-dd, I think that Prometheus receiver in collector-contrib is also crucial (not only exporter).

pellared avatar May 07 '25 14:05 pellared

Yes of course, everything performing translation between OTel and Prometheus will need to be updated eventually. I was just wondering about what prototypes you think need to be done to get the spec change moving.

Re-reading your message, I think I misunderstood. By "e2e prototypes (Collector + Go)", did you mean a single prototype where the draft PR modifying the SDK exporter is integrated into the Collector to test that it fixes our issue when producing scope attributes?

jade-guiton-dd avatar May 07 '25 14:05 jade-guiton-dd

did you mean a single prototype where the draft PR modifying the SDK exporter is integrated into the Collector to test that it fixes our issue when producing scope attributes?

No. I just mean building 3 prototypes: Go SDK Prometheus exporter, Collector Prometheus receiver, Collector Prometheus exporter and check if they will work together correctly (even "manually"). I just think it would be nice to check all components starting from an instrumented application, through Collector, finishing in Prometheus. I am fine with any other validation/testing approach.

pellared avatar May 07 '25 16:05 pellared

I updated the OTel Go's exporter implementation here: https://github.com/open-telemetry/opentelemetry-go/pull/5947. PTAL

pellared avatar May 09 '25 08:05 pellared

PoC for Prometheus exporter in collector-contrib: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40004

Reviews appreciated, especially in test coverage of edge cases :)

ArthurSens avatar May 12 '25 17:05 ArthurSens

@ArthurSens, are you going to also work on prototype for prometheusreceiver?

pellared avatar May 13 '25 09:05 pellared

Given https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40004#issue-3057707170:

Turns out our exporter never exposed otel_scope_info, so this PR just introduces the addition of the new labels.

and https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/3c6a5ea6430afaac7c33a25ff276903ef829693b/receiver/prometheusreceiver#resource-and-scope:

It drops the otel_scope_info metric, and uses attributes (other than otel_scope_name and otel_scope_version) to populate Scope Attributes.

It looks like we can drop point 3 from https://github.com/open-telemetry/opentelemetry-specification/issues/4223#issuecomment-2839595942 completely (otel_scope_info metric is useless)

pellared avatar May 13 '25 09:05 pellared

@ArthurSens, are you going to also work on prototype for prometheusreceiver?

I spent some time yesterday and it was not as easy as I expected, but I'm also not 100% familiar with that codebase 😬

I plan to spend more time on it this week, but it might take a while to have a PoC

ArthurSens avatar May 13 '25 09:05 ArthurSens

I created a specification PR. PTAL:

  • https://github.com/open-telemetry/opentelemetry-specification/pull/4505

pellared avatar May 13 '25 11:05 pellared

PoCs in collector are ready for review:

  • https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40060
  • https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40004

ArthurSens avatar May 25 '25 12:05 ArthurSens