REST latency buckets capped at 10s

Open adriangonz opened this issue 2 years ago • 0 comments

The `starlette_exporter` middleware we use in the REST server seems to cap its latency histogram buckets at 10s, which matches the default buckets in the Prometheus Python client: https://github.com/prometheus/client_python/blob/4f994ece6dcfd1905726d18e2a6899cc4474ac3d/prometheus_client/metrics.py#L544.

In some cases, request latency may exceed this threshold, so every slower request lands in the same `+Inf` bucket and can't be told apart. To account for this, we should either redistribute the buckets or let the user configure them.
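
To illustrate the problem, here is a minimal standard-library sketch (it does not use MLServer or `starlette_exporter`). `DEFAULT_BUCKETS` mirrors the defaults in `prometheus_client`; `EXTENDED_BUCKETS` and the `bucket_for` helper are hypothetical names, and the extra boundaries are only an example of what a redistribution could look like.

```python
import math
from bisect import bisect_left

# Default histogram upper bounds (in seconds) used by prometheus_client.
DEFAULT_BUCKETS = (
    0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75,
    1.0, 2.5, 5.0, 7.5, 10.0, math.inf,
)

# Hypothetical extension: keep the defaults and add boundaries above 10s.
EXTENDED_BUCKETS = DEFAULT_BUCKETS[:-1] + (30.0, 60.0, 120.0, 300.0, math.inf)


def bucket_for(latency_s, buckets):
    """Return the upper bound ("le") of the smallest bucket holding latency_s."""
    # bisect_left finds the first bound >= latency_s, matching "le" semantics.
    return buckets[bisect_left(buckets, latency_s)]


for latency in (0.3, 12.0, 45.0):
    print(
        latency,
        bucket_for(latency, DEFAULT_BUCKETS),
        bucket_for(latency, EXTENDED_BUCKETS),
    )
```

With the default bounds, the 12s and 45s requests both fall into `+Inf`, whereas the extended bounds separate them. For the configurable option, I believe `starlette_exporter`'s `PrometheusMiddleware` accepts a `buckets` argument, so one possibility (an assumption, not something checked against the MLServer code) would be to expose a setting and pass it through.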

adriangonz, Mar 06 '23 09:03