opentelemetry-rust Add API support for bound instruments

Fixes # Design discussion issue (if applicable) #1374

Changes

This PR adds API support for bound instruments for Counter and UpDownCounter. Bound instruments are counters which always change the value for a specific and predefined attribute set. The focus on this change is creating the initial API surface for bound instruments, and the mechanisms to allow non-bound instruments to become bound (since there is no direct path from aggregates to counters, it's all through closures).

To facilitate counters creating bound instruments that can refer back to their source aggregates, aggregates return a BoundMeasureGenerator<T> closure that is provided alongside the Measure<T> for the aggregate. This closure can then be invoked in order to return a BoundMeasure<T>, which provides the mechanism to invoke a measure for the cached attribute set.

This PR is not performance focused. It should not cause regressions but using bound instruments from this PR may not cause increased throughput. Performance enhancements that take advantage of bound instruments will come in a follow-up PR once we have an agreed upon API and generation path.

Benchmarks/Stress Tests

Main

Counter_Add_Sorted      time:   [717.94 ns 723.06 ns 729.17 ns]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

Counter_Add_Unsorted    time:   [711.21 ns 714.89 ns 719.15 ns]
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) high mild
  4 (4.00%) high severe

stress test: metrics
Number of threads: 6
Throughput: 6,367,800 iterations/sec
Throughput: 6,581,800 iterations/sec
Throughput: 6,736,800 iterations/sec
Throughput: 6,720,200 iterations/sec
Throughput: 6,804,600 iterations/sec

PR

Counter_Add_Sorted      time:   [712.98 ns 715.33 ns 717.99 ns]
                        change: [-0.7779% +0.3569% +1.5923%] (p = 0.60 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Counter_Add_Unsorted    time:   [705.13 ns 707.71 ns 710.82 ns]
                        change: [-2.0962% -0.9670% +0.0233%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe

Counter_Bounded         time:   [133.96 ns 134.82 ns 135.86 ns]
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) high mild
  9 (9.00%) high severe

stress test: metrics
Number of threads: 6
Throughput: 6,173,200 iterations/sec
Throughput: 6,381,600 iterations/sec
Throughput: 6,250,000 iterations/sec
Throughput: 6,434,000 iterations/sec

stress test: metrics_bounded
Number of threads: 6
Throughput: 6,473,400 iterations/sec
Throughput: 6,459,600 iterations/sec
Throughput: 6,339,200 iterations/sec
Throughput: 6,369,200 iterations/sec

Merge requirement checklist

[X] CONTRIBUTING guidelines followed
[X] Unit tests added/updated (if applicable)
[ ] Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
[ ] Changes in public API reviewed (if applicable)

Dec 14 '23 22:12 KallDrexx

The api will change slightly when/if #1421 get in to take an attribute set instead of a slice of key value pairs, but I didn't want to couple both together. Bounded performance might end up getting a bit better since cloning() becomes cheaper, but performance isn't the point of this PR :)

Dec 14 '23 22:12 KallDrexx

Codecov Report

Attention: 73 lines in your changes are missing coverage. Please review.

Comparison is base (8688e7e) 61.8% compared to head (cd3be86) 62.2%. Report is 4 commits behind head on main.

Files	Patch %	Lines
opentelemetry/src/metrics/noop.rs	0.0%	12 Missing :warning:
...-sdk/src/metrics/internal/exponential_histogram.rs	52.6%	9 Missing :warning:
...pentelemetry-sdk/src/metrics/internal/histogram.rs	0.0%	9 Missing :warning:
...entelemetry-sdk/src/metrics/internal/last_value.rs	0.0%	9 Missing :warning:
opentelemetry-sdk/src/metrics/internal/sum.rs	50.0%	9 Missing :warning:
opentelemetry-sdk/src/metrics/instrument.rs	80.5%	7 Missing :warning:
...lemetry/src/metrics/instruments/up_down_counter.rs	46.1%	7 Missing :warning:
opentelemetry/src/metrics/instruments/counter.rs	60.0%	4 Missing :warning:
...pentelemetry-sdk/src/metrics/internal/aggregate.rs	95.3%	3 Missing :warning:
opentelemetry-sdk/src/metrics/pipeline.rs	89.4%	2 Missing :warning:
... and 2 more

Additional details and impacted files

@@           Coverage Diff           @@
##            main   #1444     +/-   ##
=======================================
+ Coverage   61.8%   62.2%   +0.3%     
=======================================
  Files        144     144             
  Lines      19810   20152    +342     
=======================================
+ Hits       12256   12538    +282     
- Misses      7554    7614     +60

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Dec 14 '23 22:12 codecov[bot]

Like the general direction, kinda surprised not to see much of a performance gain over standard counters. I feel like (not related to this change) we have a lot of Arc spread through the code that maybe we could avoid. A bit of locking our locks so we can lock more locks.

Arc contains no locking. It's an atomic number that tracks how many references there are. When you clone() it increases the atomic count, and when you drop() it decrements the atomic count (and if it's the last reference it drops the underlying value). There's no locking involved in cloning, dropping, or accessing an Arc, and a lock is only involved if you access an internal function of the held type that invokes a lock (e.g. a mutex held by the arc).

The reason this doesn't provide any performance benefits is partially because #1421 is not yet merged. Since we don't have that change in, we must do a hard clone of the attribute set every time a bound measure is called (and for each measure in the ResolvedMeasure. Without #1421 it's cloning the whole vector and doing extra allocations.

There's no performance benefit too because this PR has bound instruments as a thin wrapper around normal measurements. Once this PR is merged, the next PR fully takes advantage of bound instruments by creating an atomic that's incremented by the bound measure function. This allows the BoundCounter.add() calls to increment with zero mutex locking and zero hashmap lookups. That's what this PR is setting the stage for.

Dec 19 '23 14:12 KallDrexx

@cijothomas Can we close this PR if you plan to open a new PR for bound instruments inline with the changes here?

Jul 02 '24 19:07 lalitb

@cijothomas Can we close this PR if you plan to open a new PR for bound instruments inline with the changes here?

Yes closing this. Cross-linking to https://github.com/open-telemetry/opentelemetry-rust/issues/1374, so its is easy to find the work in this PR to adopt.

Jul 02 '24 19:07 cijothomas

opentelemetry-rust opentelemetry-rust copied to clipboard

Add API support for bound instruments

Changes

Benchmarks/Stress Tests

Main

PR

Merge requirement checklist

Codecov Report

opentelemetry-rust
opentelemetry-rust copied to clipboard