compliance
Prometheus is not fully compatible with OpenMetrics tests
What did you do?
We want to ensure OpenMetrics / Prometheus compatibility in the OpenTelemetry Collector. We have been building compatibility tests to verify that the OpenMetrics spec is fully supported by the OpenTelemetry Collector Prometheus receiver and Prometheus remote write (PRW) exporter, as well as by Prometheus itself.
We used the OpenMetrics metrics test data available at https://github.com/OpenObservability/OpenMetrics/tree/main/tests/testdata/parsers
Out of a total of 161 negative tests in OpenMetrics, 94 tests pass (the invalid exposition is dropped and 'up' is 0); 67 tests are not dropped and have an 'up' value of 1, and 22 of those tests have incorrectly ingested metrics.
In order to test Prometheus itself, we set up a metrics HTTP endpoint that exposes invalid/bad metrics from the OpenMetrics tests. We then configured Prometheus 2.31.0 to scrape the metrics endpoint.
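The endpoint itself can be very small. The sketch below is illustrative rather than our actual harness (the file contents, handler name, and port 3000, which matches the scrape config later in this report, are assumptions): it serves one bad test file with the OpenMetrics content type so Prometheus will scrape it.

```python
# Illustrative sketch of a bad-metrics endpoint (not the actual test harness).
from http.server import BaseHTTPRequestHandler, HTTPServer

# Body of bad_counter_values_1 from the OpenMetrics test data.
BAD_METRICS = "# TYPE a counter\na_total -1\n# EOF\n"

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # Advertise the OpenMetrics exposition format to the scraper.
        self.send_header(
            "Content-Type",
            "application/openmetrics-text; version=1.0.0; charset=utf-8",
        )
        self.end_headers()
        self.wfile.write(BAD_METRICS.encode())

    def log_message(self, *args):
        pass  # keep request logging quiet

def make_server(port: int = 3000) -> HTTPServer:
    """Build the endpoint; call .serve_forever() and point Prometheus at it."""
    return HTTPServer(("localhost", port), MetricsHandler)

# make_server().serve_forever()  # blocks; stop with Ctrl-C
```

Swapping in the contents of other `bad_*` test files is enough to replay the whole negative suite against a running Prometheus.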
What did you expect to see?
Expected result: The scrape should fail since the target exposes an invalid metric, and the appropriate error should be reported.
For example, with the following metric data from bad_counter_values_1 (https://raw.githubusercontent.com/OpenObservability/OpenMetrics/main/tests/testdata/parsers/bad_counter_values_1/metrics):
```
# TYPE a counter
a_total -1
# EOF
```
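Why this input is invalid: OpenMetrics requires counter totals to be non-negative. A strict check for just this rule might look like the following sketch (hand-rolled for illustration; this is not the Prometheus parser):

```python
# Minimal sketch of one OpenMetrics rule: counter sample values
# MUST NOT be negative. Not the Prometheus parser.

def counter_sample_valid(line: str) -> bool:
    """Return True iff a counter sample line has a non-negative value."""
    parts = line.split()
    if len(parts) < 2:
        return False  # need at least a name and a value
    try:
        value = float(parts[1])
    except ValueError:
        return False  # value is not a number
    return value >= 0

print(counter_sample_valid("a_total -1"))  # False: negative counter
print(counter_sample_valid("a_total 1"))   # True
```

A fully compliant parser would of course enforce many more such rules (metadata ordering, label validity, EOF handling, and so on).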
What did you see instead? Under which circumstances?
Current behavior: Scrape is successful. There are multiple bad test cases that are scraped successfully by Prometheus.
For example, using bad_counter_values_1 (#5 listed below) does not show an error even though it contains a negative counter value. According to the OpenMetrics tests, this metric should not be parsed.
You can see no error has been reported and the scrape is successful.
Similar to bad_counter_values_1 test case, there are multiple bad test cases where the scrape is successful and metrics are ingested by Prometheus:
- bad_missing_or_extra_commas_0
- bad_metadata_in_wrong_place_1
- bad_counter_values_18
- bad_grouping_or_ordering_9
- bad_counter_values_1
- bad_histograms_2
- bad_counter_values_16
- bad_value_1
- bad_missing_or_extra_commas_2
- bad_invalid_labels_6
- bad_grouping_or_ordering_8
- bad_metadata_in_wrong_place_0
- bad_grouping_or_ordering_10
- bad_grouping_or_ordering_0
- bad_value_2
- bad_metadata_in_wrong_place_2
- bad_text_after_eof_1
- bad_value_3
- bad_counter_values_0
- bad_grouping_or_ordering_3
- bad_histograms_3
- bad_blank_line
Environment
- System information:

```
Darwin 20.6.0 x86_64
```

- Prometheus version:

```
version=2.31.0
```

- Prometheus configuration file:

```yaml
global:
  scrape_interval: 5s
scrape_configs:
  - job_name: "open-metrics-scrape"
    static_configs:
      - targets: ["localhost:3000"]
```
cc: @PaurushGarg @mustafain117
Thank you for this detailed report.
The Prometheus OpenMetrics parser is optimized for performance rather than correctness. It is not "OpenMetrics compatible" in the sense of rejecting every bad exposition format. We should have a fully compliant OpenMetrics parser in the future (https://github.com/prometheus/common/pull/321), but it is unlikely that we will use it in Prometheus itself for scraping. It would be used e.g. in the node exporter textfile collector, and potentially in promtool check metrics, which are out of the hot paths.
My understanding of the OpenMetrics test suite is that it is made for libraries that expose metrics.
Nice (and unexpected) find, thanks @alolita!
I see several paths forward, will need some debate; I will also add it to the backlog of the dev summits.
Super nice to see more usage of the compliance suite.
Nice (and unexpected) find, thanks @alolita!
This is expected, for the reasons Julien gave. It's designed to accept all valid inputs, and reject what it efficiently can.
Aye; still, I didn't actively have this on my radar any more. Subjective unexpectedness, not an objective one.
Agreed that emitting is different from parsing, that's one of the foundations of IETF and TCP/IP as per https://en.wikipedia.org/wiki/Robustness_principle
To keep the playing field level, we need to make it clearer which tests are MUST/SHOULD/MAY in the suite; I created a list of all tests for convenience, and that needs to be split out into more specific lists.
As suggested by @brian-brazil in our last OpenTelemetry Prometheus Workgroup meeting, we tested with the Content-Type header set to application/openmetrics-text.
All positive test cases from OpenMetrics pass with an up metric value of 1.
For negative test cases from OpenMetrics, there are 59 test cases that are scraped successfully and have an up metric value of 1:
- bad_clashing_names_0
- bad_clashing_names_1
- bad_clashing_names_2
- bad_counter_values_0
- bad_counter_values_1
- bad_counter_values_10
- bad_counter_values_11
- bad_counter_values_12
- bad_counter_values_13
- bad_counter_values_14
- bad_counter_values_15
- bad_counter_values_16
- bad_counter_values_17
- bad_counter_values_18
- bad_counter_values_19
- bad_counter_values_2
- bad_counter_values_3
- bad_counter_values_5
- bad_counter_values_6
- bad_exemplars_on_unallowed_samples_2
- bad_grouping_or_ordering_0
- bad_grouping_or_ordering_10
- bad_grouping_or_ordering_2
- bad_grouping_or_ordering_3
- bad_grouping_or_ordering_4
- bad_grouping_or_ordering_5
- bad_grouping_or_ordering_6
- bad_grouping_or_ordering_7
- bad_grouping_or_ordering_8
- bad_grouping_or_ordering_9
- bad_histograms_0
- bad_histograms_1
- bad_histograms_2
- bad_histograms_3
- bad_histograms_6
- bad_histograms_7
- bad_histograms_8
- bad_info_and_stateset_values_0
- bad_info_and_stateset_values_1
- bad_metadata_in_wrong_place_0
- bad_metadata_in_wrong_place_1
- bad_metadata_in_wrong_place_2
- bad_missing_or_invalid_labels_for_a_type_1
- bad_missing_or_invalid_labels_for_a_type_3
- bad_missing_or_invalid_labels_for_a_type_4
- bad_missing_or_invalid_labels_for_a_type_6
- bad_missing_or_invalid_labels_for_a_type_7
- bad_repeated_metadata_0
- bad_repeated_metadata_1
- bad_repeated_metadata_3
- bad_stateset_info_values_0
- bad_stateset_info_values_1
- bad_stateset_info_values_2
- bad_stateset_info_values_3
- bad_timestamp_4
- bad_timestamp_5
- bad_timestamp_7
- bad_unit_6
- bad_unit_7
cc: @alolita
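To decide pass/fail for each case, we look at the `up` series after the scrape. A minimal sketch of reading it from the Prometheus HTTP API follows (the base URL and port are assumptions about a local setup; the response shape is the standard instant-query vector):

```python
# Sketch: map each scrape job to its `up` value via the Prometheus HTTP API.
# Assumes a local Prometheus; the URL/port are illustrative.
import json
import urllib.request

def up_values(api_json: dict) -> dict:
    """Map job name -> up value from a /api/v1/query?query=up response."""
    return {
        r["metric"].get("job", "<unknown>"): float(r["value"][1])
        for r in api_json["data"]["result"]
    }

def query_up(base_url: str = "http://localhost:9090") -> dict:
    """Fetch the `up` instant vector and summarize it per job."""
    with urllib.request.urlopen(base_url + "/api/v1/query?query=up") as resp:
        return up_values(json.load(resp))
```

For a negative test case, a job such as "open-metrics-scrape" reporting 1.0 here means the bad exposition was accepted rather than rejected.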
From a quick look, bad_timestamp_4, bad_timestamp_5, and bad_timestamp_7 are all within a single line, so they could reasonably be caught by the parser efficiently. The rest require handling metadata across lines, which would be difficult to code maintainably and efficiently, given that the code is shared with the Prometheus text format parser, which allows lines in completely arbitrary order. If that weren't an issue, we could probably do enough metadata handling for validation without having to generate a ton of garbage on every scrape.
We discussed this topic yesterday at the dev summit. We intend to make sure that emitting libraries are fully compatible in what they emit, and scraping libraries will be able to be more liberal to ensure good performance.
From the rolling document:
CONSENSUS: As per RFC 761, Postel's law, “be conservative in what you do, be liberal in what you accept from others”. We will make certain that the compliance tests treat all implementations the same way and fairly.
As per previous comment, I am moving this issue to the compliance repository.
> We intend to make sure that emitting libraries are fully compatible in what they emit, and scraping libraries will be able to be more liberal to ensure good performance.
For future reference, this is not true anymore with: https://github.com/prometheus/prometheus/issues/11982