opentelemetry-java

Serialize resource attributes in target info for PrometheusHttpServer

Open · jack-berg opened this pull request 2 years ago • 11 comments

Resolves #4552.

jack-berg avatar Jun 29 '22 21:06 jack-berg

Codecov Report

Merging #4569 (c723dc9) into main (cb864fd) will decrease coverage by 0.00%. The diff coverage is 88.88%.

@@             Coverage Diff              @@
##               main    #4569      +/-   ##
============================================
- Coverage     90.04%   90.04%   -0.01%     
- Complexity     5057     5058       +1     
============================================
  Files           582      582              
  Lines         15580    15589       +9     
  Branches       1495     1496       +1     
============================================
+ Hits          14029    14037       +8     
  Misses         1096     1096              
- Partials        455      456       +1     
Impacted Files | Coverage Δ
.../opentelemetry/exporter/prometheus/Serializer.java | 86.53% <88.88%> (+0.08% ↑)


codecov[bot] avatar Jun 29 '22 21:06 codecov[bot]

@anuraaga I'm not sure I'm following: is there something in the spec that is ambiguous about which resource attributes are included? This section feels fairly straightforward.

Additionally, this behavior is purely additive right now. Users / scrapers that don't use the target info can just ignore it, so I'm not sure what the downside is.

jack-berg avatar Jun 30 '22 14:06 jack-berg

The spec does not take into consideration the size and cardinality of a typical resource, so I think it needs clarification. This is similar to not populating the resource in Zipkin; I would expect a general definition of which parts of the resource to use for cardinality-sensitive backends.

Even if a user doesn't need target_info, Prometheus will still scrape and save it, possibly causing load problems, won't it?

anuraaga avatar Jun 30 '22 23:06 anuraaga

Even if a user doesn't need target_info, Prometheus will still scrape and save it

Yes that's true.

But Prometheus should already be ascribing unique instance IDs to the metrics of each scrape target, which should have a one-to-one relationship with unique sets of resource attributes. I believe that means that including target_info shouldn't increase cardinality beyond what already exists.

jack-berg avatar Jul 01 '22 01:07 jack-berg
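
For reference, a rough sketch of what a single target_info sample might look like in the scrape output once resource attributes are serialized into it. The attribute names and values below are illustrative only, and the exact HELP/TYPE lines depend on whether the classic text format or OpenMetrics is used:

```
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="my-service",service_instance_id="627cc493-f310-47de-96bd-71410b7dec09",telemetry_sdk_language="java",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.16.0"} 1
```

However many resource attributes there are, they all become labels on this one sample with a constant value of 1, which is why the series count stays at one per SDK instance.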

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar Jul 15 '22 02:07 github-actions[bot]

Talked about this in spec issue 2640, and I've come away thinking no change is needed to the specification and we should move forward with this PR. The TL;DR is that Prometheus query syntax allows you to join the target_info attributes at query time, so this data is valuable. The cardinality concerns we discussed are probably small, and users can use metric_relabel_configs to drop high-cardinality attributes if needed. No need to reinvent the wheel with additional SDK Prometheus exporter configuration IMO.

jack-berg avatar Jul 20 '22 16:07 jack-berg
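
As a concrete illustration of the query-time join mentioned above, assuming a hypothetical counter named http_server_requests_total and resource attributes exposed as service_name / service_version labels on target_info, a PromQL query along these lines attaches those labels to the application metric:

```promql
rate(http_server_requests_total[5m])
  * on (job, instance) group_left (service_name, service_version)
  target_info
```

Since target_info always has the value 1, the multiplication leaves the metric's values unchanged and only copies the requested labels across the join.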

If every user is basically required to use relabel rules to drop attributes, it seems like a guaranteed UX burden. We do know of certain attributes, like instance ID and PID, that more or less have to be dropped; it seems like a new target_info time series on every server restart would not be efficient even if it can be joined (not sure, can it really join across series with label changes?).

As we see, OTel resource does not have a notion of ensuring low cardinality, unlike metric attributes, and in the future high-cardinality items may be added at any time. So a deny list is not really safe, and I guess with a relabel config every user needs to use labelkeep to define an allow list of labels that excludes the problematic ones. If every single user has to do this (my hypothesis is that this is true), then I guess the exporter could just take care of it to make things easier, especially since we're the most familiar with the semantic conventions.

anuraaga avatar Jul 21 '22 04:07 anuraaga
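
To make the allow-list idea above concrete, here is a rough sketch of a scrape config using metric_relabel_configs with a labelkeep action. The job name, target address, and label names are illustrative assumptions and would need to match whatever the exporter actually emits:

```yaml
scrape_configs:
  - job_name: "otel-sdk"
    static_configs:
      - targets: ["app-host:9464"]
    metric_relabel_configs:
      # Keep only an explicit allow list of label names; any other label
      # (including high-cardinality resource attributes on target_info) is dropped.
      - action: labelkeep
        regex: "__name__|job|instance|service_name|service_namespace|service_version"
```

The downside anuraaga points out is that every user would have to maintain a list like this themselves and keep it in sync with the semantic conventions.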

not sure, can it really join across series with label changes?

I believe yes, but users will likely be joining on instance, so a lot will depend on how they identify the instances they scrape.

The point I disagree on is the likelihood that most users will be impacted by the increased cardinality of this. The max number of timeseries produced by target_info, regardless of how many high-cardinality resource attributes are present, is 1 per SDK instance. Prometheus can handle a relatively large number of timeseries, with estimates supporting ~200k series on a modest default heap allocation of 2 GB. Assuming the user isn't already bumping into limits, say they have 10% headroom available, or 20k timeseries to spare. With the default retention of 15 days, they could be running 10 apps with 10 instances each, each restarting 13 times a day, before they exceed 20k series for target_info. And a 2 GB heap Prometheus instance is a pretty modest resource allocation for an environment with 100 instances (10 apps * 10 instances / app).

So while I can't say with high certainty, my SWAG is that something like less than 2% of users will notice the cardinality of target_info. It seems much more likely they'll first notice cardinality issues from their instruments. And IMO, the potential value they could get from joins to target_info outweighs the reduced UX a small population of users will experience.

I don't feel especially strongly about this, and am happy to let the Prometheus WG sort out a decision. Ultimately I think the OTLP exporter is far superior to the Prometheus exporter and will always be an advocate for that 😛

jack-berg avatar Jul 21 '22 15:07 jack-berg
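
Spelling out the back-of-the-envelope estimate above (all figures are the rough assumptions from the comment, not measurements):

```
10 apps × 10 instances/app            = 100 SDK instances
100 instances × 13 restarts/day       = 1,300 new target_info series/day
1,300 series/day × 15 days retention  ≈ 19,500 ≈ 20k series
```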

I think I've added all I can, so I'll leave you to make the decision :) One final suggestion: it would probably be good to show the Prometheus WG what target_info would look like with the javaagent and get their approval; if they approve, it should be fine. Note that if it's Spring Boot, the process.command_line will be orders of magnitude shorter than other typical commands that pass a classpath.

anuraaga avatar Jul 22 '22 01:07 anuraaga

Adding to this, the main thing driving Prometheus memory usage is active series, which is typically series that received a sample in the last 2-3 hours. If a series didn't get a sample, it is removed from memory. This means that with 20K series of headroom, they would need to see ~20K restarts within 3 hours to hit the limit.

I think the Prometheus WG is in consensus to add all resource attributes to target_info. If required, we can add a denylist of attributes.

copy what the target_info would look like with the javaagent.

I am going to look into this as I am not super familiar with it. Added it to the agenda for the next WG meeting, but will try to resolve it before then.

gouthamve avatar Jul 22 '22 12:07 gouthamve

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar Aug 05 '22 13:08 github-actions[bot]

Closed as inactive. Feel free to reopen if this PR is still being worked on.

github-actions[bot] avatar Aug 19 '22 14:08 github-actions[bot]

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar Sep 02 '22 16:09 github-actions[bot]

Closed as inactive. Feel free to reopen if this PR is still being worked on.

github-actions[bot] avatar Sep 16 '22 16:09 github-actions[bot]