Metric cardinality limits "overflow attribute set" causes inconsistent attribute keys

Open aabmass opened this issue 1 year ago • 10 comments

The Cardinality Limits section of the Metrics SDK spec says:

An overflow attribute set is defined, containing a single attribute otel.metric.overflow having (boolean) value true, which is used to report a synthetic aggregation of the metric events that could not be independently aggregated because of the limit.

This will cause every metric from the SDK to have an inconsistent set of attribute keys across its streams:

mycounter{a="hello", b="world"} 2
mycounter{a="bar", b="foo"} 3
mycounter{a="", b="foo"} 3
mycounter{otel.metric.overflow="true"} 100

Specifically for Prometheus/OpenMetrics this is a bad practice:

Metrics with the same name for a given MetricFamily SHOULD have the same set of label names in their LabelSet.
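To make the inconsistency concrete, here is a rough sketch that checks whether all streams of one metric share the same set of label names. The `label_key_sets` helper and the list-of-tuples representation are made up for illustration; they are not part of any OpenTelemetry or Prometheus API.

```python
def label_key_sets(streams):
    """Return the distinct sets of label names used across the streams."""
    return {frozenset(labels) for labels, _value in streams}


# The four streams from the example above, as (labels, value) pairs.
# This representation is an assumption made for the sketch.
streams = [
    ({"a": "hello", "b": "world"}, 2),
    ({"a": "bar", "b": "foo"}, 3),
    ({"a": "", "b": "foo"}, 3),
    ({"otel.metric.overflow": "true"}, 100),
]

key_sets = label_key_sets(streams)
# A single distinct key set would mean every stream uses the same label names.
consistent = len(key_sets) == 1
print(consistent)
```

Here the check fails because the overflow stream carries only `otel.metric.overflow`, while the other streams carry `a` and `b`.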

One solution is to add the overflow label to every metric:

mycounter{a="hello", b="world", otel_metric_overflow="false"} 2
mycounter{a="bar", b="foo", otel_metric_overflow="false"} 3
mycounter{a="", b="foo", otel_metric_overflow="false"} 3
mycounter{a="", b="", otel_metric_overflow="true"} 100

IMO this is pretty clunky. The user also has to be careful when grouping by a in this example: unless they explicitly filter on otel_metric_overflow="false", the a="" stream will be grouped with the overflow stream, which aggregates events that had other values for a.
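A small sketch of that grouping pitfall, using the four streams from the "overflow label on every metric" example. The `sum_by` helper and the in-memory representation are hypothetical and only stand in for what a query-side "sum by (a)" would do.

```python
from collections import defaultdict

# (labels, value) pairs mirroring the example above; an assumed representation.
streams = [
    ({"a": "hello", "b": "world", "otel_metric_overflow": "false"}, 2),
    ({"a": "bar", "b": "foo", "otel_metric_overflow": "false"}, 3),
    ({"a": "", "b": "foo", "otel_metric_overflow": "false"}, 3),
    ({"a": "", "b": "", "otel_metric_overflow": "true"}, 100),
]


def sum_by(streams, key, exclude_overflow=False):
    """Sum stream values grouped by one label, optionally skipping overflow."""
    totals = defaultdict(int)
    for labels, value in streams:
        if exclude_overflow and labels.get("otel_metric_overflow") == "true":
            continue
        totals[labels.get(key, "")] += value
    return dict(totals)


naive = sum_by(streams, "a")                            # overflow mixed in
filtered = sum_by(streams, "a", exclude_overflow=True)  # overflow filtered out
print(naive[""], filtered[""])
```

Without the filter, the a="" group comes out as 103 instead of 3, because the overflow stream's 100 is folded into it.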

aabmass, Jun 29 '23 18:06