compliance
Prometheus is not fully compatible with OpenMetrics tests
What did you do?
We want to ensure OpenMetrics / Prometheus compatibility in the OpenTelemetry Collector. We have been building compatibility tests to verify that the OpenMetrics spec is fully supported by the OpenTelemetry Collector Prometheus receiver and Prometheus remote write (PRW) exporter, as well as by Prometheus itself.
We used the OpenMetrics metrics test data available at https://github.com/OpenObservability/OpenMetrics/tree/main/tests/testdata/parsers
Out of a total of 161 negative tests in OpenMetrics, 94 tests pass (the invalid exposition is dropped and 'up' is 0); 67 tests are not dropped and have an 'up' value of 1, and 22 of those tests have incorrectly ingested metrics.
In order to test Prometheus itself, we set up a metrics HTTP endpoint that exposes invalid/bad metrics from the OpenMetrics tests. We then configured Prometheus 2.31.0 to scrape the metrics endpoint.
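The endpoint itself can be very small. The sketch below is illustrative rather than our actual harness (the file contents, handler name, and port 3000, which matches the scrape config later in this report, are assumptions): it serves one bad test file with the OpenMetrics content type so Prometheus will scrape it.

```python
# Illustrative sketch of a bad-metrics endpoint (not the actual test harness).
from http.server import BaseHTTPRequestHandler, HTTPServer

# Body of bad_counter_values_1 from the OpenMetrics test data.
BAD_METRICS = "# TYPE a counter\na_total -1\n# EOF\n"

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # Advertise the OpenMetrics exposition format to the scraper.
        self.send_header(
            "Content-Type",
            "application/openmetrics-text; version=1.0.0; charset=utf-8",
        )
        self.end_headers()
        self.wfile.write(BAD_METRICS.encode())

    def log_message(self, *args):
        pass  # keep request logging quiet

def make_server(port: int = 3000) -> HTTPServer:
    """Build the endpoint; call .serve_forever() and point Prometheus at it."""
    return HTTPServer(("localhost", port), MetricsHandler)

# make_server().serve_forever()  # blocks; stop with Ctrl-C
```

Swapping in the contents of other `bad_*` test files is enough to replay the whole negative suite against a running Prometheus.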
What did you expect to see?
Expected result: The scrape should fail since the target exposes an invalid metric, and the appropriate error should be reported.
For example, with the following metric data from bad_counter_values_1 (https://raw.githubusercontent.com/OpenObservability/OpenMetrics/main/tests/testdata/parsers/bad_counter_values_1/metrics):
```
# TYPE a counter
a_total -1
# EOF
```
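Why this input is invalid: OpenMetrics requires counter totals to be non-negative. A strict check for just this rule might look like the following sketch (hand-rolled for illustration; this is not the Prometheus parser):

```python
# Minimal sketch of one OpenMetrics rule: counter sample values
# MUST NOT be negative. Not the Prometheus parser.

def counter_sample_valid(line: str) -> bool:
    """Return True iff a counter sample line has a non-negative value."""
    parts = line.split()
    if len(parts) < 2:
        return False  # need at least a name and a value
    try:
        value = float(parts[1])
    except ValueError:
        return False  # value is not a number
    return value >= 0

print(counter_sample_valid("a_total -1"))  # False: negative counter
print(counter_sample_valid("a_total 1"))   # True
```

A fully compliant parser would of course enforce many more such rules (metadata ordering, label validity, EOF handling, and so on).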
What did you see instead? Under which circumstances?
Current behavior: Scrape is successful. There are multiple bad test cases that are scraped successfully by Prometheus.
For example, using bad_counter_values_1 (#5 listed below) does not show an error even though it contains a negative counter value. According to the OpenMetrics tests, this metric should not be parsed.
You can see no error has been reported and the scrape is successful.
Similar to bad_counter_values_1 test case, there are multiple bad test cases where the scrape is successful and metrics are ingested by Prometheus:
- bad_missing_or_extra_commas_0
- bad_metadata_in_wrong_place_1
- bad_counter_values_18
- bad_grouping_or_ordering_9
- bad_counter_values_1
- bad_histograms_2
- bad_counter_values_16
- bad_value_1
- bad_missing_or_extra_commas_2
- bad_invalid_labels_6
- bad_grouping_or_ordering_8
- bad_metadata_in_wrong_place_0
- bad_grouping_or_ordering_10
- bad_grouping_or_ordering_0
- bad_value_2
- bad_metadata_in_wrong_place_2
- bad_text_after_eof_1
- bad_value_3
- bad_counter_values_0
- bad_grouping_or_ordering_3
- bad_histograms_3
- bad_blank_line
Environment
- System information:

```
Darwin 20.6.0 x86_64
```

- Prometheus version:

```
version=2.31.0
```

- Prometheus configuration file:

```yaml
global:
  scrape_interval: 5s
scrape_configs:
  - job_name: "open-metrics-scrape"
    static_configs:
      - targets: ["localhost:3000"]
```
cc: @PaurushGarg @mustafain117
Thank you for this detailed report.
The Prometheus OpenMetrics parser is optimized for performance rather than correctness. It is not "OpenMetrics compatible" in the sense of rejecting every bad exposition format. We should have a fully compliant OpenMetrics parser in the future (https://github.com/prometheus/common/pull/321), but it is unlikely that we will use it in Prometheus itself for scraping. It would be used e.g. in the node exporter textfile collector, and potentially in promtool check metrics, which are out of the hot paths.
My understanding of the OpenMetrics test suite is that it is made for libraries that expose metrics.
Nice (and unexpected) find, thanks @alolita!
I see several paths forward, will need some debate; I will also add it to the backlog of the dev summits.
Super nice to see more usage of the compliance suite.
Nice (and unexpected) find, thanks @alolita!
This is expected, for the reasons Julien gave. It's designed to accept all valid inputs, and reject what it efficiently can.
Aye; still, I didn't actively have this on my radar any more. Subjective unexpectedness, not an objective one.
Agreed that emitting is different from parsing, that's one of the foundations of IETF and TCP/IP as per https://en.wikipedia.org/wiki/Robustness_principle
To keep the playing field level, we need to make it clearer which tests are MUST/SHOULD/MAY in the suite; I created a list of all tests for convenience, and that needs to be split out into more specific lists.
As suggested by @brian-brazil in our last OpenTelemetry Prometheus Workgroup meeting, we tested with the Content-Type header set to application/openmetrics-text.
All positive test cases from OpenMetrics pass with an up metric value of 1.
For negative test cases from OpenMetrics, there are 59 test cases that are scraped successfully and have an up metric value of 1:
- bad_clashing_names_0
- bad_clashing_names_1
- bad_clashing_names_2
- bad_counter_values_0
- bad_counter_values_1
- bad_counter_values_10
- bad_counter_values_11
- bad_counter_values_12
- bad_counter_values_13
- bad_counter_values_14
- bad_counter_values_15
- bad_counter_values_16
- bad_counter_values_17
- bad_counter_values_18
- bad_counter_values_19
- bad_counter_values_2
- bad_counter_values_3
- bad_counter_values_5
- bad_counter_values_6
- bad_exemplars_on_unallowed_samples_2
- bad_grouping_or_ordering_0
- bad_grouping_or_ordering_10
- bad_grouping_or_ordering_2
- bad_grouping_or_ordering_3
- bad_grouping_or_ordering_4
- bad_grouping_or_ordering_5
- bad_grouping_or_ordering_6
- bad_grouping_or_ordering_7
- bad_grouping_or_ordering_8
- bad_grouping_or_ordering_9
- bad_histograms_0
- bad_histograms_1
- bad_histograms_2
- bad_histograms_3
- bad_histograms_6
- bad_histograms_7
- bad_histograms_8
- bad_info_and_stateset_values_0
- bad_info_and_stateset_values_1
- bad_metadata_in_wrong_place_0
- bad_metadata_in_wrong_place_1
- bad_metadata_in_wrong_place_2
- bad_missing_or_invalid_labels_for_a_type_1
- bad_missing_or_invalid_labels_for_a_type_3
- bad_missing_or_invalid_labels_for_a_type_4
- bad_missing_or_invalid_labels_for_a_type_6
- bad_missing_or_invalid_labels_for_a_type_7
- bad_repeated_metadata_0
- bad_repeated_metadata_1
- bad_repeated_metadata_3
- bad_stateset_info_values_0
- bad_stateset_info_values_1
- bad_stateset_info_values_2
- bad_stateset_info_values_3
- bad_timestamp_4
- bad_timestamp_5
- bad_timestamp_7
- bad_unit_6
- bad_unit_7
cc: @alolita
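To decide pass/fail for each case, we look at the `up` series after the scrape. A minimal sketch of reading it from the Prometheus HTTP API follows (the base URL and port are assumptions about a local setup; the response shape is the standard instant-query vector):

```python
# Sketch: map each scrape job to its `up` value via the Prometheus HTTP API.
# Assumes a local Prometheus; the URL/port are illustrative.
import json
import urllib.request

def up_values(api_json: dict) -> dict:
    """Map job name -> up value from a /api/v1/query?query=up response."""
    return {
        r["metric"].get("job", "<unknown>"): float(r["value"][1])
        for r in api_json["data"]["result"]
    }

def query_up(base_url: str = "http://localhost:9090") -> dict:
    """Fetch the `up` instant vector and summarize it per job."""
    with urllib.request.urlopen(base_url + "/api/v1/query?query=up") as resp:
        return up_values(json.load(resp))
```

For a negative test case, a job such as "open-metrics-scrape" reporting 1.0 here means the bad exposition was accepted rather than rejected.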
From a quick look, bad_timestamp_4, bad_timestamp_5, and bad_timestamp_7 are all within a single line, so they could reasonably be caught by the parser efficiently. The rest require handling metadata across lines, which would be difficult to code maintainably and efficiently, given that the code is shared with the Prometheus text format parser, which allows lines in completely arbitrary order. If that weren't an issue, we could probably do enough metadata handling for validation without having to generate a ton of garbage on every scrape.
We discussed this topic yesterday at the dev summit. We intend to make sure that emitting libraries are fully compatible in what they emit, and scraping libraries will be able to be more liberal to ensure good performance.
From the rolling document:
CONSENSUS: As per RFC 761, Postel's law, “be conservative in what you do, be liberal in what you accept from others”. We will make certain that the compliance tests treat all implementations the same way and fairly.
As per previous comment, I am moving this issue to the compliance repository.
> We intend to make sure that emitting libraries are fully compatible in what they emit, and scraping libraries will be able to be more liberal to ensure good performance.
For future reference, this is not true anymore with: https://github.com/prometheus/prometheus/issues/11982