client_golang icon indicating copy to clipboard operation
client_golang copied to clipboard

Optionally add OM unit

Open vesari opened this issue 2 years ago • 23 comments

This PR adds support for unit for Open Metrics. It is ready for review, but not ready to merge (see my long comment below). Fixes #684.

vesari avatar Nov 26 '23 13:11 vesari

Hey @vesari , just passing by to ask if there is anything we could do to unblock you 👋

ArthurSens avatar Mar 15 '24 10:03 ArthurSens

Hey @vesari , just passing by to ask if there is anything we could do to unblock you 👋

@ArthurSens hello! I think I'm just using a Go version which is too new, I plan on fixing that later today. Thanks a lot for checking on me, much appreciated! :)

vesari avatar Mar 15 '24 11:03 vesari

@ArthurSens hello again! Here are a few considerations to give you some insight about what I’ve done here and why. I had to upgrade Go in the GitHub workflows so that the version is at least 1.21, as this is needed by common 0.50.0 which, as you know, contains the necessary implementation for unit. I also changed the Makefile and the go.mod and main.go files in the tutorial folder accordingly. I have not touched the .bingo file but I guess the version should be upgraded there too. However, Tests(1.20) is still expected to run among the checks, and, as you can see, just hangs, but I suppose that’s because of the repo settings, so I’ll need your help there.

The lint was failing on a supposedly unused test function unrelated to my changes (withDebugMetrics), so I removed it. Let me know if I should instead just have the lint ignore it.

I made the changes needed where an OM format can be passed to expfmt.NewEncoder, and also all the others stemming from the field Unit being added to the type prometheus.Desc. In some cases, I chose to populate this field with the unit which was already included in the metric family name, but I could have also just put an empty string, and I’m not sure what the preferred behaviour would be or if it doesn’t make any difference. I’m talking especially about the various collectors files, like prometheus/collector/dbstats_collector.go , prometheus/go_collector.go or prometheus/go_collector_latest.go etc. I’d be grateful if you could advise me on this.

Also, so far I haven’t added any new tests to promhttp/http_test.go, considering not even the EnableOpenMetrics option has been tested there. Let me know if I should.

Lastly, I happened to push a lot of commits, let me know if you want me to rebase and squash them.

As per usual, many thanks for your time and help! :)

vesari avatar Mar 15 '24 13:03 vesari

Oh sorry for the silence here, somehow I missed your last comment.

I had to upgrade Go in the GitHub workflows so that the version is at least 1.21, as this is needed by common 0.50.0 which, as you know, contains the necessary implementation for unit.

Yeah, that's a problem actually. We promise support for the 3 latest go versions, so we can't bump our go version to 1.21 until mid-August when go 1.23 is released. I've started conversations about how we can improve this situation with common.

The lint was failing on a supposedly unused test function unrelated to my changes (withDebugMetrics), so I removed it. Let me know if I should instead just have the lint ignore it.

This is another problem that I've been delaying for way too long 😬 From my understanding it doesn't block PRs though. But anyway, I'll try to get back to this soon, no need to remove it now :)


Regarding the changes that are related to OpenMetrics units, I haven't taken a deep look yet 😬. I'm mostly concerned with enabling the PR first and will review the changes once we solve all the blockers

ArthurSens avatar Mar 28 '24 11:03 ArthurSens

Thanks a lot for the detailed update! Much appreciated

vesari avatar Mar 29 '24 08:03 vesari

Hi @vesari, all blockers have been resolved finally :)

The unused function is now used, and we can update prometheus/common without giving up support of go 1.20. If you rebase your PR it should be fine :)

ArthurSens avatar Apr 11 '24 17:04 ArthurSens

Awesome! Thank you very much! I'm on it.

vesari avatar Apr 13 '24 07:04 vesari

@ArthurSens , I've marked this as "ready for review", but I'm still assuming your PR should be merged first?

vesari avatar May 07 '24 07:05 vesari

I realized we already kind of agreed to supporting this... https://github.com/prometheus/client_golang/issues/684 so probably too late to challenge it cc @beorn7 @ArthurSens - do you remember what's the benefit other than "compatibility"?

The friction comment and breaking change stays. I feel we could do better than adding Unit field and asking everybody to duplicate this information 🤔

Perhaps some automation AND clear guidance e.g. in comment to NOT add it if you have correct unit in name would help?

bwplotka avatar May 09 '24 11:05 bwplotka

@bwplotka thanks, I'll await @beorn7 and @ArthurSens 's feedback on your questions, as they know the original motivation :)

vesari avatar May 09 '24 13:05 vesari

I realized we already kind of agreed to supporting this... https://github.com/prometheus/client_golang/issues/684 so probably too late to challenge it cc @beorn7 @ArthurSens - do you remember what's the benefit other than "compatibility"?

At this point, I believe the main benefit is only compatibility with OpenMetrics 1.0, yes. I totally understand that OpenMetrics 1.0 has many details that hinge adoption in the Prometheus ecosystem, but I think we can't ignore that OM 1.0 exists and is used out there.

Prometheus already parses OM units, and stores it as metadata. So my rationale to accept this was to finally close the gap between go applications that wanted this information in Prometheus.

With that said, I admit I'm not aware of how units are used in Prometheus besides easy identification when a human reads PromQL. I was not part of the team when OpenMetrics became a thing, but I had the impression that units were added to the data model for a reason, and that work was being done (or planned to) to make use of units when making operations between metrics that were exposed with different units, e.g. http_requests_duration_seconds + http_requests_duration_miliseconds. I recognize that I could have done a better job here and questioned my own assumptions, but I can also see how units are important for Prometheus now that we're investing in OTLP compatibility. The metric data model for OTLP uses Unit as a metric identifier, which shows the importance for us to support units in client_golang.

ArthurSens avatar May 10 '24 17:05 ArthurSens

If we decide to support units, let's think of alternative that isn't breaking changes to downstream projects :)

Of course! I'll await your decision whether to support them and, if so, I'll be happy to continue working on it the way you suggested :)

vesari avatar May 11 '24 08:05 vesari

I realized we already kind of agreed to supporting this... https://github.com/prometheus/client_golang/issues/684 so probably too late to challenge it cc @beorn7 @ArthurSens - do you remember what's the benefit other than "compatibility"?

It's hard for me to give a fair judgement here (because of my difficult history with OM). Having said that, the OM approach to units strikes me as weird.

I understand the historical Prometheus approach of "including the unit in the metric name" as a pragmatic solution to the lack of proper metadata support in Prometheus. It should be noted that it always has been just a convention.

OM added the unit metadata field, presumably because others asked for it (the OTel data model didn't exist back then, but I assume that the presence of a unit field in OTel comes from the same school of thinking). Now you might think, if OM has it as an explicit field, you don't have to include it in the metric name anymore. But OM did the opposite: If there is a unit field, the same string MUST be in the metric name, otherwise it is not valid OM. Personally, I think it should be possible to include the unit in the metric name (for those that like that for all the situation where you do not want to lookup the metadata even if you could), but given that others want to live in a world where metadata is always available at your fingertips, enforcing it in the metric name feels redundant.

Sadly, OTel took its guidance for Prometheus compatibility from OM, so the way the OTel data model is converted to Prometheus is already following the OM requirements, although there was never a Prometheus consensus about it.

Looping back to the question: How should we deal with the situation in client_golang? Given how far unit support has already seeped into various parts of the wider ecosystem, it's tough to say "this is all f*** up, let's just ignore it". I think we should make it possible to expose the unit field for those that want it, but I would, for now, not enforce that the unit is also in the metric name (therefore accepting that we might generate invalid OM 1.0, but anticipating that this requirement is lifted in OM 2.0). It could be a note in the doc comment, like: "If you want to create valid OM 1.0, include the unit in the metric name, but this is not enforced." (I guess the OM vision was that you specify the name without the unit in instrumentation, then specify the unit separately (but only once), and then the actually exposed metric name includes the metric?)

How "1st class" adding the unit should be, is up to debate. I definitely wouldn't require all users to now add a unit. It should be an option for those that want it.

WRT to NewDesc: I once had an ambitious plan to get rid of (user-visible) Desc, see also #222 for issues with Desc. Not sure how to add unit without making everything even worse. Maybe NewDescWithUnit is OK for now.

beorn7 avatar May 14 '24 16:05 beorn7

Thank you all very much for your considerations and suggestions! In order not to enforce that the unit is also in the metric name when the unit field is populated, the logic in common has to be modified accordingly, by reverting some of the most recent changes that enforce exactly that. Of course, I'll be happy to do so. After that, I can work on the NewDescWithUnit approach for NewDesc. Let me know if I can proceed @ArthurSens @bwplotka @beorn7 :)

vesari avatar May 16 '24 10:05 vesari

To be clear: @ArthurSens @bwplotka @kakkoyun should make the call here. I'm just providing my personal context.

beorn7 avatar May 16 '24 10:05 beorn7

Hi @vesari!

The team met recently and one of the things we discussed was whether we accept OpenMetrics units or not. A TL;DR; is: We won't accept this until Prometheus has a clear use case for unit metadata.


Reading the OpenMetrics specification, we can read[1][2]:

A MetricFamily MAY have zero or more Metrics. A MetricFamily MUST have a name, HELP, TYPE, and UNIT metadata. Every Metric within a MetricFamily MUST have a unique LabelSet.
.
.
.
Unit specifies MetricFamily units. If non-empty, it MUST be a suffix of the MetricFamily name separated by an underscore. Be aware that further generation rules might make it an infix in the text format.

We clearly understand that units are required by OpenMetrics 1.0, and we clearly understand that units must be used as metric name suffixes. However, at this point, if we accept this the only benefit would be to say "We're OpenMetrics 1.0 compatible". Being OpenMetrics 1.0 compatible is not a bad thing per se, but Prometheus hasn't used the Unit metadata for anything yet, so it brings Client_golang users little benefit while making our API a bit more complex.

If in the future Prometheus introduces the usage of Unit metadata, e.g. to properly sum metrics with different but compatible units (one in seconds and the other in milliseconds), we'd be happy to resurrect this.

In the name of the team, we apologize for taking this long to make a decision. Your work was quite involved, spreading contributions across multiple repositories and communicating with several stakeholders. We'd be happy to continue working with you if that's also of your interest :)

ArthurSens avatar May 19 '24 14:05 ArthurSens

Maybe we could start brainstorming how to use the unit metadata in Prometheus? Although this would be a high-complexity project that I would also need a lot of time to understand properly😬

ArthurSens avatar May 19 '24 14:05 ArthurSens

Hi @vesari!

The team met recently and one of the things we discussed was whether we accept OpenMetrics units or not. A TL;DR; is: We won't accept this until Prometheus has a clear use case for unit metadata.

Reading the OpenMetrics specification, we can read[1][2]:

A MetricFamily MAY have zero or more Metrics. A MetricFamily MUST have a name, HELP, TYPE, and UNIT metadata. Every Metric within a MetricFamily MUST have a unique LabelSet.
.
.
.
Unit specifies MetricFamily units. If non-empty, it MUST be a suffix of the MetricFamily name separated by an underscore. Be aware that further generation rules might make it an infix in the text format.

We clearly understand that units are required by OpenMetrics 1.0, and we clearly understand that units must be used as metric name suffixes. However, at this point, if we accept this the only benefit would be to say "We're OpenMetrics 1.0 compatible". Being OpenMetrics 1.0 compatible is not a bad thing per se, but Prometheus hasn't used the Unit metadata for anything yet, so it brings Client_golang users little benefit while making our API a bit more complex.

If in the future Prometheus introduces the usage of Unit metadata, e.g. to properly sum metrics with different but compatible units (one in seconds and the other in milliseconds), we'd be happy to resurrect this.

In the name of the team, we apologize for taking this long to make a decision. Your work was quite involved, spreading contributions across multiple repositories and communicating with several stakeholders. We'd be happy to continue working with you if that's also of your interest :)

@arthursens, no worries at all, I totally understand. Of course I’d be delighted to continue working with the team, thanks for suggesting that!

vesari avatar May 20 '24 05:05 vesari

Maybe we could start brainstorming how to use the unit metadata in Prometheus? Although this would be a high-complexity project that I would also need a lot of time to understand properly😬

Either this or anything the team deem useful. Whatever you think needs attention, I’ll be happy to work on :) Feel free to DM me on Slack if necessary. Thank you!

vesari avatar May 20 '24 05:05 vesari

As agreed with @bwplotka, we want to add units eventually, but this needs more thought on how to avoid adding significant boilerplate to all users for that; help wanted, closing for now.

vesari avatar Jul 31 '24 10:07 vesari

We might want to ressurect this PR now that https://github.com/prometheus/proposals/pull/39 adds a purpose to units in Prometheus :)

ArthurSens avatar Jun 09 '25 21:06 ArthurSens

I'll be happy to do so

vesari avatar Jun 10 '25 11:06 vesari

@kakkoyun ignore the request for review for now, as it was sent automatically once I reopened the PR. There's still a lot to catch up on here and work to do.

vesari avatar Jun 10 '25 11:06 vesari