opentelemetry-cpp

Histograms not reporting the same set of bounds passed in a view

Open timwoj opened this issue 9 months ago • 8 comments

Describe your environment

  • otel-cpp: v1.12.0
  • Platform: macOS (but happens on others as well)

Steps to reproduce

  • Create a histogram instrument
  • Add a histogram aggregation view to the meter provider with a different set of bounds
  • Retrieve the data for the histogram from a MetricReader

What is the expected behavior? The reported bounds, and the way values are assigned to buckets, match the bounds that were set via the view.

What is the actual behavior? The bounds values match the default set of bounds, seemingly ignoring the view.
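
To make the difference concrete, here is a minimal standalone sketch of how a value is assigned to a bucket for a given set of explicit bounds. It assumes the OpenTelemetry data model convention that each bound is an inclusive upper edge; `BucketIndex` and `BucketCounts` are hypothetical helper names for illustration, not opentelemetry-cpp functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of opentelemetry-cpp).
// Returns the bucket index for `value` given sorted explicit bounds.
// Bucket i covers (bounds[i-1], bounds[i]]; the last bucket, with index
// bounds.size(), covers everything above the final bound.
std::size_t BucketIndex(const std::vector<double> &bounds, double value)
{
  // lower_bound finds the first bound >= value, so each bound is an
  // inclusive upper edge of its bucket.
  return static_cast<std::size_t>(
      std::lower_bound(bounds.begin(), bounds.end(), value) - bounds.begin());
}

// Hypothetical helper: counts how many values land in each bucket.
// The result always has bounds.size() + 1 entries.
std::vector<std::size_t> BucketCounts(const std::vector<double> &bounds,
                                      const std::vector<double> &values)
{
  std::vector<std::size_t> counts(bounds.size() + 1, 0);
  for (double v : values)
  {
    ++counts[BucketIndex(bounds, v)];
  }
  return counts;
}
```

With view bounds of 0, 50, 100, a recorded value of 30 would fall into bucket 1, whereas with default bounds starting 0, 5, 10, 25, 50 the same value falls into bucket 4, which is how the mismatch shows up in the collected data.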

timwoj, Nov 07 '23 20:11

@timwoj I tested this with the OTLP HTTP exporter, and it seems to be working. Attaching the modified examples/otlp/http_metric_main.cc for your reference.

http_metric_main.cc.txt

Also, there is a unit test that validates this: https://github.com/open-telemetry/opentelemetry-cpp/blob/e8afbb8eac5bd2abb96643c36bef5818a416dbea/sdk/test/metrics/histogram_test.cc#L78

Otherwise, could you share more details, ideally a code snippet that reproduces the issue?

lalitb, Dec 21 '23 22:12

I can link you to the code in our project, but it's pretty ingrained into all of it, so it might be hard to follow. I'll see about pulling out a minimal reproducer. I reworked everything to closely follow what the test is doing, but it's still giving me the same results. The only major difference was that the view in the test is created before the instrument, whereas I had it the other way around. Switching it didn't change anything though.

timwoj, Jan 02 '24 22:01

Thanks, a minimal reproducer would be helpful here :)

lalitb, Jan 03 '24 01:01

Here's a minimal reproducer. It's likely that I'm just holding it wrong here, but I definitely have no idea how.

test.cc.txt

Running that results in:

% ./a.out
0.000000
5.000000
10.000000
25.000000
50.000000
75.000000
100.000000
250.000000
500.000000
750.000000
1000.000000
2500.000000
5000.000000
7500.000000
10000.000000

timwoj, Jan 03 '24 19:01

@lalitb Any further ideas here?

timwoj, Jan 18 '24 20:01

I had a few minutes to look at this again and compare it to the test case. It appears that if you make the instrument after the view, everything works fine. If you make the instrument first, it doesn't pick up the correct set of bounds.

timwoj, Feb 12 '24 03:02

@timwoj Apologies for not checking this, as I have been focused on other stuff. Your analysis is correct, and this is not just for bounds, but for all other view configurations. They are only valid for the instruments created after the view. Probably, this needs to be documented better.
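
The ordering effect can be pictured with a toy model in which an instrument resolves its bounds once, at creation time. This is only an assumed simplification that matches the behavior described in this comment; `ToyMeter` and its methods are hypothetical names and do not mirror the SDK's actual classes.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model (not opentelemetry-cpp code): bounds are looked up when the
// instrument is created and never refreshed afterwards.
class ToyMeter
{
public:
  explicit ToyMeter(std::vector<double> default_bounds)
      : default_bounds_(std::move(default_bounds))
  {}

  // Registers custom bounds for instruments with this name.
  void AddView(const std::string &instrument_name, std::vector<double> bounds)
  {
    views_[instrument_name] = std::move(bounds);
  }

  // Snapshots whichever bounds apply at this moment: the view's bounds if a
  // view is already registered, otherwise the defaults.
  void CreateHistogram(const std::string &instrument_name)
  {
    auto it = views_.find(instrument_name);
    instruments_[instrument_name] = (it != views_.end()) ? it->second : default_bounds_;
  }

  // Returns the bounds captured when the instrument was created.
  const std::vector<double> &BoundsOf(const std::string &instrument_name) const
  {
    return instruments_.at(instrument_name);
  }

private:
  std::vector<double> default_bounds_;
  std::map<std::string, std::vector<double>> views_;
  std::map<std::string, std::vector<double>> instruments_;
};
```

In this model, calling `AddView` before `CreateHistogram` yields the custom bounds, while the reverse order leaves the instrument on the defaults.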

lalitb, Feb 12 '24 05:02

Unfortunately that's how we're already doing it (view before instrument) in the Zeek code, and it's still not functioning correctly. Is there any easy way to track this down?

timwoj, Feb 13 '24 15:02

I finally figured out what this was, after much staring at debugger output. I was passing the wrong prefix name to the MeterSelector constructor for the metric when I created the view. This caused the instrument to appear in the ResourceMetrics during the reader's Collect method, but not the view, so it was returning the wrong data. I was under the impression that the prefix you pass into that constructor should be the same for all metrics in the entire program (including ones with different prefixes), but that's apparently incorrect.
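
For anyone hitting the same thing, the failure mode can be sketched with a toy registry in which a view only applies when both the meter name and the instrument name match. This is a simplified illustration, not the SDK's implementation; `ToyView`, `ToyViewRegistry`, and the names used with them are hypothetical.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a view plus its selectors (not the opentelemetry-cpp types).
struct ToyView
{
  std::string meter_name;       // meter the view was registered for
  std::string instrument_name;  // instrument the view was registered for
  std::vector<double> bounds;   // custom histogram bounds
};

class ToyViewRegistry
{
public:
  explicit ToyViewRegistry(std::vector<double> default_bounds)
      : default_bounds_(std::move(default_bounds))
  {}

  void AddView(ToyView view) { views_.push_back(std::move(view)); }

  // Returns the bounds of the first view whose meter name and instrument name
  // both match; silently falls back to the defaults when nothing matches.
  const std::vector<double> &BoundsFor(const std::string &meter_name,
                                       const std::string &instrument_name) const
  {
    for (const ToyView &view : views_)
    {
      if (view.meter_name == meter_name && view.instrument_name == instrument_name)
      {
        return view.bounds;
      }
    }
    return default_bounds_;
  }

private:
  std::vector<double> default_bounds_;
  std::vector<ToyView> views_;
};
```

A view registered under one meter name has no effect on an instrument created by a meter with a different name, and nothing reports the mismatch: the instrument simply gets the default bounds.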

timwoj, Feb 21 '24 00:02

I'll go ahead and close this since it's not a bug in otel-cpp, but in my understanding of how things work.

timwoj, Feb 21 '24 00:02