bazel-buildfarm icon indicating copy to clipboard operation
bazel-buildfarm copied to clipboard

Configurable bucket for histogram metrics

Open chestnutprog opened this issue 2 years ago • 3 comments
trafficstars

All buildfarm histogram metrics except grpclatency is not configurable, and some of them have improper default value, especially time related metrics on large repo (execution_time_ms,input_fetch_time_ms,queue_time_ms...) with too small bucket (the largest is [10ms, +inf]).

Can we make those bucket configurable or set some more proper default vaules? Thanks.

chestnutprog avatar Jul 04 '23 08:07 chestnutprog

It might be sufficient to update the prometheus client dependency from using io.prometheus:simpleclient:0.10.0 to io.prometheus:prometheus-metrics-core:1.0.0 which now supports prometheus native histograms ( see http://prometheus.github.io/client_java/getting-started/metric-types/#histogram ) That does have the caveat that the prometheus server run with --enable-feature=native-histograms -- so there might be value in implementing custom buckets for classic histograms too.

Note upgrading to the 1.0.0 prometheus client would also require updating/patching me.dinowernli:java-grpc-prometheus as it uses simpleclient as well to register its metrics if grpc metrics are enabled -- it would at least need to rebuild with the new compatability layer.

tgerdes-cohesity avatar Oct 26 '23 05:10 tgerdes-cohesity

Can you try this? This should resolve your issue. https://github.com/bazelbuild/bazel-buildfarm/pull/1376

amishra-u avatar Nov 02 '23 03:11 amishra-u

#1376 seems related, but only addresses the latency histograms created by the me.dinowernli:java-grpc-prometheus package.

This report specifically mentions the other metrics such as execution_time_ms , input_fetch_time_ms, and queue_time_ms, among others. None of these call .buckets() to override the default buckets of .005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, and 10.

There is one example of a Histogram overriding the default buckets in cas_ttl_s but that one is not configurable.

tgerdes-cohesity avatar Nov 14 '23 23:11 tgerdes-cohesity