Redis endpoints success/error/timeout stats are zero

Open zigmund opened this issue 1 year ago • 5 comments

Title: Redis endpoints success/error/timeout stats are zero

Description: v1.29.0 introduced a new feature, per_endpoint_stats. I've enabled it with Redis upstreams and see envoy_cluster_endpoint_rq_total increasing, but envoy_cluster_endpoint_rq_success, envoy_cluster_endpoint_rq_error and envoy_cluster_endpoint_rq_timeout are always zero. The /clusters endpoint also shows zero values for these stats.

We're currently using Envoy only with Redis upstreams, so I don't know whether this is Redis-specific.

What issue is being seen? Describe what should be happening instead of

envoy_cluster_endpoint_rq_success, envoy_cluster_endpoint_rq_error and envoy_cluster_endpoint_rq_timeout counters increasing accordingly.

Repro steps: Enable the following in a cluster with Redis upstreams:

          track_cluster_stats:
            per_endpoint_stats: true

Make some requests. Observe only envoy_cluster_endpoint_rq_total increasing, but not the other rq stats.

Admin and Stats Output:
/stats: https://gist.github.com/zigmund/642b32394ba87612188e9eff73c605b1
/clusters: https://gist.github.com/zigmund/9eec9c17e933fa52299539dede1f8be5
/routes: 404 Not Found
/server_info: https://gist.github.com/zigmund/df51bd50b7a9267c7a4b2ac82712e70c

Config: https://gist.github.com/zigmund/b0194f01acb427933f8c7e9d8b3b9720

Logs: https://gist.github.com/zigmund/62063e1dff35af0afff585ea481c04cc

zigmund avatar Mar 01 '24 06:03 zigmund

Probably just not implemented for redis.

mattklein123 avatar Mar 01 '24 14:03 mattklein123

True, this doesn't look related to per_endpoint_stats.

Even without that, I see the same issue:

redis_cluster::127.0.0.1:6379::cx_active::1
redis_cluster::127.0.0.1:6379::cx_connect_fail::0
redis_cluster::127.0.0.1:6379::cx_total::1
redis_cluster::127.0.0.1:6379::rq_active::0
redis_cluster::127.0.0.1:6379::rq_error::0
redis_cluster::127.0.0.1:6379::rq_success::0
redis_cluster::127.0.0.1:6379::rq_timeout::0
redis_cluster::127.0.0.1:6379::rq_total::3
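
To check for this symptom across many hosts, the /clusters text output can be scanned for endpoints where rq_total is non-zero while rq_success, rq_error and rq_timeout are all zero. The following is a minimal sketch, assuming the `cluster::host::stat::value` line layout shown above; the function names are hypothetical helpers, not part of Envoy.

```python
from collections import defaultdict

# Per-host request counters of interest in the /clusters text output.
RQ_STATS = {"rq_total", "rq_success", "rq_error", "rq_timeout"}


def parse_host_stats(text):
    """Collect rq_* counters per (cluster, host) from /clusters-style lines.

    Assumes lines shaped like `cluster::host::stat::value`. Splitting from
    the right keeps host addresses that contain colons intact.
    """
    hosts = defaultdict(dict)
    for line in text.splitlines():
        parts = line.strip().rsplit("::", 2)
        if len(parts) != 3:
            continue
        prefix, stat, value = parts
        # Skip other stats, cluster-level lines, and non-numeric values.
        if stat not in RQ_STATS or "::" not in prefix or not value.isdigit():
            continue
        cluster, host = prefix.split("::", 1)
        hosts[(cluster, host)][stat] = int(value)
    return dict(hosts)


def hosts_missing_outcomes(hosts):
    """Return hosts with rq_total > 0 but no success/error/timeout counts."""
    flagged = []
    for key, stats in sorted(hosts.items()):
        outcomes = sum(
            stats.get(name, 0)
            for name in ("rq_success", "rq_error", "rq_timeout")
        )
        if stats.get("rq_total", 0) > 0 and outcomes == 0:
            flagged.append(key)
    return flagged


# Sample copied from the output above.
SAMPLE = """\
redis_cluster::127.0.0.1:6379::cx_active::1
redis_cluster::127.0.0.1:6379::cx_connect_fail::0
redis_cluster::127.0.0.1:6379::cx_total::1
redis_cluster::127.0.0.1:6379::rq_active::0
redis_cluster::127.0.0.1:6379::rq_error::0
redis_cluster::127.0.0.1:6379::rq_success::0
redis_cluster::127.0.0.1:6379::rq_timeout::0
redis_cluster::127.0.0.1:6379::rq_total::3
"""

if __name__ == "__main__":
    for cluster, host in hosts_missing_outcomes(parse_host_stats(SAMPLE)):
        print(f"{cluster} {host}: rq_total > 0 but no outcome counters")
```

In practice the text would come from the admin /clusters endpoint rather than a hard-coded sample.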

Pawan-Bishnoi avatar Mar 02 '24 11:03 Pawan-Bishnoi

Can I work on this? I see it marked as help wanted

pratyushprakash avatar Mar 04 '24 19:03 pratyushprakash

The PR that added per_endpoint_stats marked #21685 as closed. In that issue, @vandyvilla mentioned that they needed per-upstream-host stats specifically for Redis cluster. How can this not be implemented for Redis yet if the PR closed an issue requesting these stats specifically for Redis? What work remains?

miroswan avatar Mar 17 '24 06:03 miroswan

> Can I work on this? I see it marked as help wanted

I think you can @pratyushprakash 😄

Pawan-Bishnoi avatar Jun 28 '24 06:06 Pawan-Bishnoi