opentelemetry-python Prometheus exporter undefined behaviour when not all labels are set

Describe your environment

opentelemetry-exporter-prometheus==1.12.0rc1
opentelemetry-sdk==1.19.0
opentelemetry-api==1.19.0
Python 3.11.0

Steps to reproduce

from opentelemetry import metrics

test_counter = metrics.get_meter("module").create_counter('test_counter', description='counter used for testing')

test_counter.add(1, {'location': 'here', 'source': 'external'})
test_counter.add(1, {'source': 'internal', 'reason': 'error'})

What is the expected behavior? I expect labels to be the ones specified. But it seems like it only looks at the last call to counter.add to figure out label names, and then just looks at the position of the labels. I also expect these the be the same metric. (no duplicate HELP, TYPE). This also has other weird behaviour when the dict size is different.

The otlp api for counter says this: Users can provide attributes to associate with the increment value, but it is up to their discretion. Therefore, this API MUST be structured to accept a variable number of attributes, including none. This suggest to me that it should be possible to set only certain labels (as far as I understand prometheus also supports missing labels). If this is not the case it should probably be made possible to define label names when creating a counter so this can at least by static type checked. This would also be helpful if missing labels are supported to prevent typos when specifying labels, but that should be probably a separate issue?

What is the actual behavior?

# HELP test_counter_total counter used for testing
# TYPE test_counter_total counter
test_counter_total{reason="external",source="here"} 1.0
# HELP test_counter_total counter used for testing
# TYPE test_counter_total counter
test_counter_total{reason="error",source="internal"} 1.0

Jul 24 '23 14:07 soundofspace

Thanks for reporting. There is definitely a bug here but I'm not sure the best fix it. We are actually wrapping Prometheus client with a custom collector to avoid reimplementing the text format.

The *MetricFamily constructors expect a sequence of label names, then to have calls to add_metric() with label values. So afaict the prometheus client library doesn't really support this, probably because OM recommends against it.

I expect labels to be the ones specified. But it seems like it only looks at the last call to counter.add to figure out label names, and then just looks at the position of the labels.

OpenTelemetry fully supports this. It looks like a bug in the Prometheus exporter where the label names are being rewritten in a for loop. Draft fix here https://github.com/open-telemetry/opentelemetry-python/pull/3416

I also expect these the be the same metric. (no duplicate HELP, TYPE). This also has other weird behaviour when the dict size is different.

Unfortunately that PR does create duplicates (see the added test case). I don't see a way around it with the Prometheus client unless we build a superset of label names across all points. However that has slightly different semantics.

Aug 24 '23 22:08 aabmass

@soundofspace

(as far as I understand prometheus also supports missing labels).

I don't think the official prometheus client does. It requires specifying all label names at instrument creation and then requires passing all of them every time. For example

import time
from prometheus_client import Counter, start_http_server

c = Counter(
    name="promclient_counter",
    documentation="Inconsistent label names with prometheus client",
    labelnames=("a", "b", "newa", "newb"),
)
c.labels(a="a", b="b").inc(10)
c.labels(newa="newa", newb="newb").inc(10)

fails with

Traceback (most recent call last):
  File "repro.py", line 63, in <module>
    c.labels(a="a", b="b").inc(10)
    ^^^^^^^^^^^^^^^^^^^^^^
  File "venv/lib/python3.11/site-packages/prometheus_client/metrics.py", line 182, in labels
    raise ValueError('Incorrect label names')
ValueError: Incorrect label name

Aug 25 '23 17:08 aabmass

From https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#metric-metadata-1:

Prometheus SDK exporters MUST NOT allow duplicate UNIT, HELP, or TYPE comments for the same metric name to be returned in a single scrape of the Prometheus endpoint. Exporters MUST drop entire metrics to prevent conflicting TYPE comments, but SHOULD NOT drop metric points as a result of conflicting UNIT or HELP comments. Instead, all but one of the conflicting UNIT and HELP comments (but not metric points) SHOULD be dropped. If dropping a comment or metric points, the exporter SHOULD warn the user through error logging.

This is something we should be able to fix in the exporter. If its helpful, this is how it was fixed in the go exporter: https://github.com/open-telemetry/opentelemetry-go/pull/3469

Aug 25 '23 17:08 dashpole

Ok, I think we will have to move away from using the prometheus client. I don't see a way to prevent it from generating UNIT, HELP, or TYPE for each Metric family. The Golang client seems to be a bit smarter and dedups them for you.

Aug 25 '23 19:08 aabmass

I wonder if, in the interests of pragmatism, we should merge @aabmass's fix regardless of the spec compliance mentioned by @dashpole?

The Prometheus output breaks the spec with or without the fix, but we can at least benefit from having the labels behave correctly until we can find a way to improve the spec compliance as well.

Dec 17 '23 23:12 sproberts92

I am getting this bug also. I came to try OpenTelemetry BECAUSE the official prometheus_client is inadequate for this scenario but found that OpenTelemetry is similary crippled until this is fixed.

Jan 10 '24 16:01 LeonChadwick

It's even worse then I first reported, it's not only having issues with missing labels, but also empty string labels if order is not the same. This was already something that I did to work around this issue.

Subtle bugs like this are really breaking the user experience, and I'm hoping a workaround or permanent fix for this can be deployed rather quickly as this issue has been open for over 6 months and is quite annoying and difficult to work around.

test_counter.add(1, {'cause': '', 'reason': 'blbal'})
test_counter.add(1, {'reason': 'something', 'cause': 'blbal'})

# HELP test_counter_total counter used for testing
# TYPE test_counter_total counter
test_counter_total{cause="blbal",reason=""} 1.0
# HELP test_counter_total counter used for testing
# TYPE test_counter_total counter
test_counter_total{cause="blbal",reason="something"} 1.0

Feb 19 '24 16:02 soundofspace