
[Proposal] detailed metric mode, for value in descriptor KEY_VALUE pair, if value not explicitly defined/matched in config

Open wwillsey opened this issue 5 years ago • 3 comments

To gain greater observability into the operations of the rate limit service, I propose a "detailed" metric mode that exposes metrics on descriptor KEY_VALUE pairs in all cases, not only when the value is explicitly configured in the config. These values would allow substantially more granular alerting and downstream reporting, based on the metrics the process emits when a descriptor key is matched but its value is not.

While there's an understandable concern over metric cardinality when including the values in the metrics, it is often the case that the possible values are well understood and limited, such that including them would pose little concern.

wwillsey, Oct 19 '20 20:10

Seems like a practical compromise might be to have a fixed-size buffer for each config that holds a sample of the real-world key/value pairs matched during particular time buckets, or what OpenCensus would call "exemplars": https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/Exemplars.md (also demonstrated in https://youtu.be/U72b4Nl0Ftw?t=1300).
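To make the fixed-size buffer idea concrete, here is a rough sketch in Go. It is not taken from the ratelimit codebase or from OpenCensus: the `Exemplar` and `ExemplarBuffer` names are hypothetical, and reservoir sampling is just one assumed way to pick which pairs to keep within a time bucket.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// Exemplar is one observed descriptor entry (hypothetical type).
type Exemplar struct {
	Key   string
	Value string
}

// ExemplarBuffer keeps a bounded random sample of the key/value pairs
// seen during the current time bucket, so memory use stays fixed no
// matter how many distinct values show up.
type ExemplarBuffer struct {
	mu      sync.Mutex
	samples []Exemplar
	seen    int
	rng     *rand.Rand
}

// NewExemplarBuffer creates a buffer holding at most capacity samples.
func NewExemplarBuffer(capacity int, seed int64) *ExemplarBuffer {
	return &ExemplarBuffer{
		samples: make([]Exemplar, 0, capacity),
		rng:     rand.New(rand.NewSource(seed)),
	}
}

// Observe records a matched pair using reservoir sampling: the first
// `capacity` pairs are always kept, later ones replace a random slot
// with probability capacity/seen.
func (b *ExemplarBuffer) Observe(key, value string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.seen++
	if len(b.samples) < cap(b.samples) {
		b.samples = append(b.samples, Exemplar{Key: key, Value: value})
		return
	}
	if j := b.rng.Intn(b.seen); j < len(b.samples) {
		b.samples[j] = Exemplar{Key: key, Value: value}
	}
}

// Flush returns a copy of the current bucket's samples and starts a
// new, empty bucket.
func (b *ExemplarBuffer) Flush() []Exemplar {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := make([]Exemplar, len(b.samples))
	copy(out, b.samples)
	b.samples = b.samples[:0]
	b.seen = 0
	return out
}

func main() {
	buf := NewExemplarBuffer(3, 42)
	for i := 0; i < 100; i++ {
		buf.Observe("user_id", fmt.Sprintf("user-%d", i))
	}
	// At most 3 exemplars survive, however many values were observed.
	fmt.Println(len(buf.Flush()))
}
```

Something like `Flush` would presumably be called once per time bucket by whatever exports the stats; how the samples are then attached to metrics is left open here.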

dweitzman, Oct 19 '20 21:10

The exemplars approach sounds interesting. I would also be fine with an "all-metrics" mode as long as it is opt-in.

mattklein123, Oct 19 '20 23:10

Might be an unrelated question, but I wonder how people track things like latency/QPS/error rate at the API (ShouldRateLimit) level?

lmajercak-wish, Mar 10 '21 18:03

I also have an interest in this and created an issue quite a while ago that turned "stale": https://github.com/envoyproxy/ratelimit/issues/311. I have now created an implementation of this that would at least meet our needs.

I added a key to the configuration called "include_value_in_metric_when_not_specified", which will override the default behavior and add the values to metrics.
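For anyone skimming, the effect of such a flag can be sketched roughly like this in Go. This is not the code from the PR: `statKey` and its parameters are hypothetical, and the `DOMAIN.KEY_VALUE` naming is only assumed from the way the issue title describes KEY_VALUE pairs.

```go
package main

import "fmt"

// statKey builds a metric name for a single descriptor entry.
//
// valueConfigured is true when the config lists this exact value for
// the key. includeUnmatched stands in for the opt-in flag
// (hypothetical parameter): when set, the value is appended even if
// it was not explicitly configured.
func statKey(domain, key, value string, valueConfigured, includeUnmatched bool) string {
	if value != "" && (valueConfigured || includeUnmatched) {
		return domain + "." + key + "_" + value
	}
	// Default behavior: only the key appears, so all unconfigured
	// values share one metric.
	return domain + "." + key
}

func main() {
	// Value not configured, flag off: values collapse into one metric.
	fmt.Println(statKey("edge", "tenant", "acme", false, false))
	// Value not configured, flag on: the value becomes part of the name.
	fmt.Println(statKey("edge", "tenant", "acme", false, true))
}
```

With the flag off, every unconfigured value lands on the same metric name; with it on, each distinct value gets its own, which is where the cardinality concern raised earlier in the thread comes from.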

I can create a PR attached to this issue and you can have a look. PR submitted: https://github.com/envoyproxy/ratelimit/pull/389

jespersoderlund, Jan 08 '23 11:01