opentelemetry-go Investigate if attribute filtering should be in the instrument or aggregator

Investigate which is more optimal

Filter attributes at the instrument level when a "record" operation is called.
Filter attributes at the aggregator level when an Aggregation collection is called.

Ensure we implement the optimal one.

Jul 12 '22 17:07 MrAlias

IMO filtering is better done in the aggregator, since the cost will be once per export interval instead of once per observation.

Jul 18 '22 21:07 jmacd

Initial benchmarking for the last-value aggregation (the simplistic one) shows the expected performance differences.

Benchmark:

func benchmarkFiltered[N int64 | float64](factory func(attribute.Filter) (Measure[N], ComputeAggregation)) func(*testing.B) {
	nAttr := []int{1, 10, 100}       // Number of distinct attribute sets.
	nMeas := []int{1, 10, 100, 1000} // Number of measurements made per attribute set.
	return func(b *testing.B) {
		for _, attributeCap := range nAttr {
			for _, measurements := range nMeas {
				name := fmt.Sprintf("Attributes/%d/Measurements/%d", attributeCap, measurements)
				b.Run(name, func(b *testing.B) {
					attrs := make([]attribute.Set, attributeCap)
					for i := range attrs {
						attrs[i] = attribute.NewSet(
							userAlice,
							attribute.Int("value", i),
						)
					}

					got := &bmarkRes
					ctx := context.Background()
					meas, comp := factory(attrFltr)

					b.ReportAllocs()
					b.ResetTimer()
					for n := 0; n < b.N; n++ {
						for m := 0; m < measurements; m++ {
							for _, attr := range attrs {
								meas(ctx, 1, attr)
							}
						}

						assert.Equal(b, 1, comp(got), "attributes not filtered")
					}
				})
			}
		}
	}
}

$ go test -run='^$' -bench=LastValue/Filtered/Int64 -count=10 > old.txt  # run on main
...
$ go test -run='^$' -bench=LastValue/Filtered/Int64 -count=10 > new.txt  # run on test branch
...
$ benchstat old.txt new.txt
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/metric/internal/aggregate
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
                                                            │   old.txt   │                new.txt                │
                                                            │   sec/op    │    sec/op     vs base                 │
LastValue/Filtered/Int64/Attributes/1/Measurements/1-8        1.154µ ± 3%    1.779µ ± 2%   +54.23% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/10-8       4.776µ ± 3%    3.772µ ± 1%   -21.03% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/100-8      40.46µ ± 4%    23.15µ ± 2%   -42.79% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/1000-8     398.4µ ± 3%    218.9µ ± 6%   -45.06% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1-8       4.836µ ± 2%   10.019µ ± 2%  +107.18% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/10-8      40.57µ ± 3%    30.68µ ± 1%   -24.39% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/100-8     406.0µ ± 3%    240.3µ ± 1%   -40.82% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1000-8    4.088m ± 3%    2.349m ± 1%   -42.55% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1-8      41.33µ ± 2%    87.16µ ± 1%  +110.90% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/10-8     411.0µ ± 3%    291.8µ ± 1%   -29.00% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/100-8    4.058m ± 3%    2.417m ± 1%   -40.42% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1000-8   40.43m ± 2%    22.82m ± 6%   -43.56% (p=0.000 n=10)
geomean                                                       144.2µ         119.3µ        -17.28%

                                                            │     old.txt     │               new.txt                │
                                                            │      B/op       │     B/op      vs base                │
LastValue/Filtered/Int64/Attributes/1/Measurements/1-8             216.0 ± 0%     328.0 ± 0%  +51.85% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/10-8           1944.0 ± 0%     327.0 ± 0%  -83.18% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/100-8         19224.0 ± 0%     328.0 ± 0%  -98.29% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/1000-8       192024.0 ± 0%     327.5 ± 0%  -99.83% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1-8          1.898Ki ± 0%   2.057Ki ± 0%   +8.33% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/10-8        18.773Ki ± 0%   2.056Ki ± 0%  -89.05% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/100-8      187.523Ki ± 0%   2.055Ki ± 0%  -98.90% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1000-8    1875.026Ki ± 0%   2.058Ki ± 1%  -99.89% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1-8         18.77Ki ± 0%   18.93Ki ± 0%   +0.83% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/10-8       187.52Ki ± 0%   18.93Ki ± 0%  -89.90% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/100-8     1875.03Ki ± 0%   18.97Ki ± 0%  -98.99% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1000-8   18750.06Ki ± 0%   19.43Ki ± 0%  -99.90% (p=0.000 n=10)
geomean                                                          60.02Ki        2.323Ki       -96.13%

                                                            │    old.txt    │              new.txt               │
                                                            │   allocs/op   │ allocs/op   vs base                │
LastValue/Filtered/Int64/Attributes/1/Measurements/1-8           3.000 ± 0%   4.000 ± 0%  +33.33% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/10-8         21.000 ± 0%   4.000 ± 0%  -80.95% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/100-8       201.000 ± 0%   4.000 ± 0%  -98.01% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/1000-8     2001.000 ± 0%   4.000 ± 0%  -99.80% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1-8          21.00 ± 0%   22.00 ± 0%   +4.76% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/10-8        201.00 ± 0%   22.00 ± 0%  -89.05% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/100-8      2001.00 ± 0%   22.00 ± 0%  -98.90% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1000-8    20001.00 ± 0%   22.00 ± 0%  -99.89% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1-8         201.0 ± 0%   203.0 ± 0%   +1.00% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/10-8       2001.0 ± 0%   203.0 ± 0%  -89.86% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/100-8     20001.0 ± 0%   203.0 ± 0%  -98.99% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1000-8   200001.0 ± 0%   203.0 ± 0%  -99.90% (p=0.000 n=10)
geomean                                                          660.4        26.14       -96.04%

In the trivial case, where there is one measurement per distinct attribute set, there is an decrease in CPU performance and an increase in memory use. This makes sense as the backing array for the aggregate will be the performance bottle-neck, not the filtering computation.

However, for more realistic workloads, where there are many measurements for distinct attribute sets, the CPU performance increased and memory use decreased. Importantly though, the allocation scaled by O(N) for N being the number of distinct attributes instead of O(N+M) for M being the number of measurements made in a collection. This allocation detail is not reflected in the CPU performance, but will be a major factor in real world scenarios based on the GC pressure that will be removed (in all but the trivial case).

This initial testing shows this change should continue to be pursued.

Dec 18 '23 19:12 MrAlias

Cardinality limiting is going to complicate this. Limiting is currently done on the measurement of values. However, with this filtering being done on collection the current limiting will "over limit".

Dec 19 '23 17:12 MrAlias

Cardinality limiting is going to complicate this. Limiting is currently done on the measurement of values. However, with this filtering being done on collection the current limiting will "over limit".

This interaction needs to be brought to the specification prior to the cardinality limit being stabilized.

Dec 19 '23 19:12 MrAlias

Cardinality limiting is going to complicate this. Limiting is currently done on the measurement of values. However, with this filtering being done on collection the current limiting will "over limit".

This interaction needs to be brought to the specification prior to the cardinality limit being stabilized.

https://github.com/open-telemetry/opentelemetry-specification/issues/3798

Dec 19 '23 21:12 MrAlias

Moving out of the post-GA project. There is not clear consensus on how to resolve https://github.com/open-telemetry/opentelemetry-specification/pull/3803.

Give the solution to this will require inconsistent attribute filter values in favor of performance there needs to be a strong user desire to see this before it warrants the developer commitment.

Jan 29 '24 20:01 MrAlias

opentelemetry-go opentelemetry-go copied to clipboard

Investigate if attribute filtering should be in the instrument or aggregator

opentelemetry-go
opentelemetry-go copied to clipboard