Redis endpoints success/error/timeout stats are zero
Title: Redis endpoints success/error/timeout stats are zero
Description:
There is a new feature in v1.29.0: per_endpoint_stats.
I've enabled it with redis upstreams and see envoy_cluster_endpoint_rq_total increasing, but envoy_cluster_endpoint_rq_success, envoy_cluster_endpoint_rq_error and envoy_cluster_endpoint_rq_timeout are always zero.
/clusters endpoint also shows zero values for these stats.
We're currently using Envoy only with Redis upstreams, so I don't know whether this is specific to Redis or not.
Expected behavior: the envoy_cluster_endpoint_rq_success, envoy_cluster_endpoint_rq_error and envoy_cluster_endpoint_rq_timeout counters should increase accordingly.
Repro steps:
Enable in cluster with redis upstreams:
  track_cluster_stats:
    per_endpoint_stats: true
Make some requests.
Observe only envoy_cluster_endpoint_rq_total increasing, but not the other rq stats.
Admin and Stats Output:
/stats: https://gist.github.com/zigmund/642b32394ba87612188e9eff73c605b1
/clusters: https://gist.github.com/zigmund/9eec9c17e933fa52299539dede1f8be5
/routes: 404 Not Found
/server_info: https://gist.github.com/zigmund/df51bd50b7a9267c7a4b2ac82712e70c
Config: https://gist.github.com/zigmund/b0194f01acb427933f8c7e9d8b3b9720
Logs: https://gist.github.com/zigmund/62063e1dff35af0afff585ea481c04cc
Probably just not implemented for redis.
True, this doesn't look related to per_endpoint_stats.
Even without that, I see the same issue:
redis_cluster::127.0.0.1:6379::cx_active::1
redis_cluster::127.0.0.1:6379::cx_connect_fail::0
redis_cluster::127.0.0.1:6379::cx_total::1
redis_cluster::127.0.0.1:6379::rq_active::0
redis_cluster::127.0.0.1:6379::rq_error::0
redis_cluster::127.0.0.1:6379::rq_success::0
redis_cluster::127.0.0.1:6379::rq_timeout::0
redis_cluster::127.0.0.1:6379::rq_total::3
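For anyone who wants to check whether their own hosts show the same pattern, here is a rough Python sketch that scans /clusters text output for hosts whose rq_total is non-zero while rq_success, rq_error and rq_timeout are all zero. The function names (parse_clusters, hosts_missing_outcomes) are made up for illustration, the sample is the output pasted above, and the parser assumes the `cluster::host::stat::value` layout with IPv4 or hostname addresses (an IPv6 address containing `::` would need different splitting).

```python
from collections import defaultdict


def parse_clusters(text):
    """Parse `cluster::host::stat::value` lines into {(cluster, host): {stat: int}}.

    Lines that do not have exactly four `::`-separated fields, or whose
    value is not a plain non-negative integer, are skipped.
    """
    hosts = defaultdict(dict)
    for line in text.splitlines():
        parts = line.strip().split("::")
        if len(parts) != 4:
            continue
        cluster, host, stat, value = parts
        if value.isdigit():
            hosts[(cluster, host)][stat] = int(value)
    return dict(hosts)


def hosts_missing_outcomes(hosts):
    """Return (cluster, host) keys with finished requests but no recorded outcome."""
    flagged = []
    for key, stats in hosts.items():
        # Requests still in flight have no outcome yet, so exclude them.
        finished = stats.get("rq_total", 0) - stats.get("rq_active", 0)
        outcomes = (
            stats.get("rq_success", 0)
            + stats.get("rq_error", 0)
            + stats.get("rq_timeout", 0)
        )
        if finished > 0 and outcomes == 0:
            flagged.append(key)
    return flagged


# Sample taken from the /clusters output quoted in this thread.
SAMPLE = """\
redis_cluster::127.0.0.1:6379::cx_active::1
redis_cluster::127.0.0.1:6379::cx_connect_fail::0
redis_cluster::127.0.0.1:6379::cx_total::1
redis_cluster::127.0.0.1:6379::rq_active::0
redis_cluster::127.0.0.1:6379::rq_error::0
redis_cluster::127.0.0.1:6379::rq_success::0
redis_cluster::127.0.0.1:6379::rq_timeout::0
redis_cluster::127.0.0.1:6379::rq_total::3
"""

if __name__ == "__main__":
    for cluster, host in hosts_missing_outcomes(parse_clusters(SAMPLE)):
        print(f"{cluster} {host}: rq_total > 0 but no success/error/timeout recorded")
```

To run it against a live instance, replace SAMPLE with the text fetched from the admin /clusters endpoint.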
Can I work on this? I see it marked as help wanted
The referenced PR (per_endpoint_stats) closed #21685. In that issue, @vandyvilla mentioned that they needed per-upstream-host stats specifically for Redis cluster. How can this not be implemented for Redis yet if the referenced PR closed an issue requesting these stats specifically for Redis? What outstanding work is expected to be required?
Can I work on this? I see it marked as help wanted
I think you can @pratyushprakash 😄