Reporting Histogram Sum Metric

Open shivajividhale opened this issue 2 years ago • 2 comments

Hey folks,

I'm looking for clarification on how histograms are reported for Prometheus metrics. A sample histogram metric in the Prometheus exposition format looks like this:

# HELP http_request_duration_seconds Api requests response time in seconds
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_sum{api="add_product", instance="host1.domain.com"} 8953.332
http_request_duration_seconds_count{api="add_product", instance="host1.domain.com"} 27892
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="0"} 0
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="0.01"} 0
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="0.025"} 8
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="0.05"} 1672
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="0.1"} 8954
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="0.25"} 14251
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="0.5"} 24101
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="1"} 26351
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="2.5"} 27534
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="5"} 27814
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="10"} 27881
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="25"} 27890
http_request_duration_seconds_bucket{api="add_product", instance="host1.domain.com", le="+Inf"} 27892

In our testing, the <histogram_metric>_sum series is ingested but is not published/reported through the Histogram struct. As a result, we cannot calculate averages, which require dividing _sum by _count.
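To make the missing calculation concrete, here is a minimal Go sketch of the average we would like to derive. The helper name meanFromSumAndCount is hypothetical and not part of any library; the inputs are the _sum and _count values from the sample above.

```go
package main

import (
	"errors"
	"fmt"
)

// meanFromSumAndCount is a hypothetical helper: it returns the average
// observed value of a histogram, given its _sum and _count series values.
func meanFromSumAndCount(sum float64, count uint64) (float64, error) {
	if count == 0 {
		return 0, errors.New("no observations: count is zero")
	}
	return sum / float64(count), nil
}

func main() {
	// Values taken from the sample exposition above.
	mean, err := meanFromSumAndCount(8953.332, 27892)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("average request duration: %.3f s\n", mean)
}
```

With the sample values this comes to roughly 0.321 seconds per request, which is the number we currently cannot produce without _sum.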

If I'm not mistaken, this leaves room for a patch that supports the _sum metric (http_request_duration_seconds_sum in the example above), for instance by adding a field such as sum float64 to the Histogram struct.

To avoid introducing breaking changes, could we add another method, func (h *histogram) RecordSum(sum float64), which would let us update the Histogram struct?
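As a rough sketch of what this could look like, the Go snippet below uses a stand-in histogram type rather than the library's real one. The struct layout, the Sum accessor, and the choice to overwrite rather than accumulate are all assumptions made for illustration; only the RecordSum signature comes from the proposal itself.

```go
package main

import (
	"fmt"
	"sync"
)

// histogram is a hypothetical stand-in, not the library's actual type.
type histogram struct {
	mu  sync.Mutex
	sum float64 // proposed field: cumulative sum of observed values
}

// RecordSum stores the cumulative value scraped from <metric>_sum.
// Assumption: the new value replaces the previous one instead of being
// added to it, because Prometheus already reports _sum as a running total.
func (h *histogram) RecordSum(sum float64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.sum = sum
}

// Sum returns the most recently recorded sum (hypothetical accessor).
func (h *histogram) Sum() float64 {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.sum
}

func main() {
	h := &histogram{}
	h.RecordSum(8953.332) // http_request_duration_seconds_sum from the sample
	fmt.Println(h.Sum())
}
```

Because RecordSum would be a new method, existing callers that never invoke it should be unaffected, which is the point of adding it alongside the current API rather than changing an existing signature.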

Please let us know whether this seems reasonable, or whether we are missing something. If it does, my team will be happy to contribute a patch upstream.

Thanks

shivajividhale • Feb 07 '23 00:02