telegraf icon indicating copy to clipboard operation
telegraf copied to clipboard

feat(parser.prometheusremotewrite, serializer.prometheusremotewrite): Native histogram support end-to-end

Open Reimirno opened this issue 1 year ago • 3 comments

Summary

Add a configuration field to prometheusremotewrite parser to change its behavior when parsing Prometheus native histogram.

Previously prometheusremotewrite parser parses a native histogram into multiple Telegraf metric (the conversion is not lossless and is irreversible); now it parses it into one single Telegraf metric. Correspondingly, add a code path to prometheusremotewrite serializer to be able to handle this single Telegraf "native histogram" metric and convert back to Prometheus native histogram. NOTE: logic that handles classic histogram is orthogonal and not changed.

This treatment of native histograms is able to preserve its benefits in terms of correctness guarantee (atomicity, no correctness issue due to write batching) and its performance gain (low cardinality, sparse data structure). More importantly, this means Telegraf can supports native histogram end-to-end: ingest a native histogram and write out a native histogram.

Detailed rationale, including a diagram, please see #16120

Test

Previously, using a prometheusremotewrite HTTP listener v2 input to accept prometheus remote write, and a prometheusremotewrite HTTP output, Telegraf ingests prometheus native histogram but writes out several counters, as if this were a classic histogram.

In the query here, a native histogram get translated into several "bucket" metrics (but without _bucket suffix and with a <metric_name>_le tag.

image

Now, after this PR:

This native histogram now gets correctly written out as a native histogram.

image

I can add unit test / integration test when the community reaches some agreement on this implementation.

Checklist

  • [x] No AI generated code was used in this PR

Related issues

resolves #16120

Reimirno avatar Oct 31 '24 22:10 Reimirno

Thanks so much for the pull request! :handshake: :black_nib: Just a reminder that the CLA has not yet been signed, and we'll need it before merging. Please sign the CLA when you get a chance, then post a comment here saying !signed-cla

telegraf-tiger[bot] avatar Oct 31 '24 22:10 telegraf-tiger[bot]

!signed-cla

Reimirno avatar Oct 31 '24 22:10 Reimirno

@Reimirno any update?

srebhan avatar Jan 17 '25 07:01 srebhan

Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem, if not please try posting this question in our Community Slack or Community Forums or provide additional details in this issue and reqeust that it be re-opened. Thank you!

telegraf-tiger[bot] avatar Jan 31 '25 18:01 telegraf-tiger[bot]

@srebhan I think they replied to aal of your comments. I think they are waiting on you instead.

Hipska avatar Jan 31 '25 19:01 Hipska

@Hipska we had a conversation on Slack that splitting the histogram into fields is the right thing to do...

@Reimirno if you feel like I owe you a reply to anything, please speak up. Otherwise, if you do want to continue to work on this PR, please reopen or tell me to reopen here...

srebhan avatar Feb 04 '25 10:02 srebhan

@srebhan Sorry, I dropped the ball due to some changes in life.

I made a new PR implementing the parser side of stuff for this: https://github.com/influxdata/telegraf/pull/16493, accoridng to what we have discussed.

Future contributors interested in taking this up: feel free to ping me if you have questions.

Reimirno avatar Feb 09 '25 03:02 Reimirno