go-redis icon indicating copy to clipboard operation
go-redis copied to clipboard

go-redis cluster: why updateLatency use ping without a timeout

Open smartnews-weitao opened this issue 3 years ago • 8 comments

So it's actually not a bug, but i am wondering why inside the updateLatency function, use the ping method without a timeout instead of with a timeout? Because i am thinking if one of the node is failed, and i choosed RouteByLatency, then the request still might route to the failed node if his performance was good before the failure. If we have a 1 sec timeout, will it work better in this case?

smartnews-weitao avatar Jan 29 '21 03:01 smartnews-weitao

Sounds reasonable.

behappyforever avatar Jan 29 '21 07:01 behappyforever

Sounds reasonable.

Will you guys implement this soon or do you think it's not a general issue and i should fork and fix on my own version?

smartnews-weitao avatar Jan 29 '21 07:01 smartnews-weitao

Sounds reasonable.

Will you guys implement this soon or do you think it's not a general issue and i should fork and fix on my own version?

We have set RouteByLatency false in our production enviourment. And I'd like to see this PR been merged.

behappyforever avatar Jan 29 '21 07:01 behappyforever

You are supposed to set Read/WriteTimeout so all commands finish eventually.

vmihailenco avatar Jan 29 '21 13:01 vmihailenco

You are supposed to set Read/WriteTimeout so all commands finish eventually.

My intention is: let the ping command inside updateLatency func finish within a timeout so the client can detect the failure of a node and chose not to route following requests to that node.

By setting Read timeout to 1s at opt level, you sure the ping(context.TODO()) will also finish within a second right? Is it because the timeout has been set at connection level?

smartnews-weitao avatar Jan 30 '21 01:01 smartnews-weitao

By setting Read timeout to 1s at opt level, you sure the ping(context.TODO()) will also finish within a second right? Is it because the timeout has been set at connection level?

Yes, context is relatively recent addition and before it was added Read/WriteTimeout was the only and main way to ensure the connection does not hang.

vmihailenco avatar Jan 30 '21 07:01 vmihailenco

By setting Read timeout to 1s at opt level, you sure the ping(context.TODO()) will also finish within a second right? Is it because the timeout has been set at connection level?

Yes, context is relatively recent addition and before it was added Read/WriteTimeout was the only and main way to ensure the connection does not hang.

Thank you, i have another question. What's the behavior if i use a cluster client call a non-key command, like scan/info memory? Is there's an efficient way that i can scan the entire cluster with a prefix?

smartnews-weitao avatar Jan 31 '21 03:01 smartnews-weitao

What's the behavior if i use a cluster client call a non-key command, like scan/info memory?

AFAIK it chooses a random node and runs the command on it.

Is there's an efficient way that i can scan the entire cluster with a prefix?

I don't know, but perhaps using ForEachMaster with Scan.

vmihailenco avatar Feb 02 '21 12:02 vmihailenco

This issue is marked stale. It will be closed in 30 days if it is not updated.

github-actions[bot] avatar Sep 22 '23 00:09 github-actions[bot]