feat(parser.prometheusremotewrite): Add dense metric version to better support histograms
Summary
Split v1/v2 metric parser version for prometheusremotewrite format, emulating that in prometheus format.
V2 is the original, existing behavior.Code is directly lifted from the original plugins/parsers/prometheusremotewrite/parser.go.
This PR implements V1, which emulates the dense behavior of prometheus format v1. It:
- converts prom metric name to measurement name
- uses a generic key
valuefor value field - converts a prom native histogram into one single telegraf metric, instead of multiple (as in V2)
Note that prometheusremotewrite V1 DOES NOT convert classic histogram/summary into one single telegraf metric as in the case of prometheus V1, because in a remote write request, a classic histogram is already broken up into various sub-metrics (counters/gauges), and they could even be in different remote write batches. However, it does conserves the dense behavior for native histogram.
--
This PR partially inplemented the design in #16120 and partially supercedes PR #16121, as a follow up to resolve this comment and design discussion in slack channel with @srebhan.
A PR for prometheusremotewrite serializer can be found at #16494
Checklist
- [x] No AI generated code was used in this PR
Related issues
resolves #16120
Thanks for the helpful comments! I will iterate on them.
I think I addressed all the comments. Feel free to reopen if I left out anything.
Some comments on the folder-based test cases:
- input file format: Prom RW with native histogram needs to be in protobuf format and I don't find an easy and readable way to provide that kind of input as a file for the test input to consume. (I went down the prototext rabbit hole only to find there're some version mismatch of the proto unmarshall librarys - remember prometheus is using quite a unique proto library) So I just opted for json files for input and parse json into a prometheus remote write request struct. (This is also why I end up not using
"github.com/influxdata/telegraf/testutil/plugin_input" - As for output, it could be json or influx lines or whatever. It doesn't matter. Used json just for consistency.
Some comments on the folder-based test cases:
- input file format: Prom RW with native histogram needs to be in protobuf format and I don't find an easy and readable way to provide that kind of input as a file for the test input to consume. (I went down the prototext rabbit hole only to find there're some version mismatch of the proto unmarshall librarys - remember prometheus is using quite a unique proto library) So I just opted for json files for input and parse json into a prometheus remote write request struct. (This is also why I end up not using
"github.com/influxdata/telegraf/testutil/plugin_input"
You can usually also encode the data as JSON (see e.g. the GNMI unit-tests). If that's not an option, the data can be stored as binary data in the testcase directories...
- As for output, it could be json or influx lines or whatever. It doesn't matter. Used json just for consistency.
The GNMI test linked above might be a nice example of how to use the testutil machinery to ease testing. The big benefit of folder-based test-cases is that you can easily add tests with data provided by the user. In e.g. GNMI we do have an option to output the raw (protobuf) input data making reproducing issues a copy-and-paste operation.
Thank you so much for the feedback; I would take a look soon and iterate on this again.
By using
-
jsonpbto parse json input files -
influx.Parserto parse influxdb line format expected output files
We are able to greatly simplify the test setup logic. Thanks for your wonderful suggestions!
I think the broken integration test (on output.sql) is unrelated to my changes and already fixed in master by https://github.com/influxdata/telegraf/commit/edb28dea0715ab20c6f32ccd954db8601ba82c7a
should I rebase or do we don't care?
Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip. Downloads for additional architectures and packages are available below.
:partying_face: This pull request decreases the Telegraf binary size by -2.69 % for linux amd64 (new size: 284.0 MB, nightly size 291.8 MB)
:package: Click here to get additional PR build artifacts
Artifact URLs
@Reimirno just ignore this for now, we know it's fixed in master...