go-redis icon indicating copy to clipboard operation
go-redis copied to clipboard

FailoverClient may have unreasonable retry when meeting a redis sentinel master switch

Open wyxloading opened this issue 2 months ago • 3 comments

Expected Behavior

go-redis v9.14.0

command invocation using a FailoverClient with retry setting like this should retry until master switch operation done and send commands to new master within about 30 seconds. Actually it should cost less than 30s cause the master switch operation done in about 5 seconds.

MaxRetries:      39,
MinRetryBackoff: time.Millisecond,
MaxRetryBackoff: time.Second,

Current Behavior

Some command call will cost more than 30s ( longer than the retry settings ). That we call it unreasonable retry, cause the master switch operation done in about 4-5 seconds in our case. Mostly command call should retry and eventually done in about 5-6s at most.

Possible Solution

https://github.com/redis/go-redis/blob/v9.14.0/redis.go#L353-L388

getConn get an old master connection, before initConn, then master switch occured, that connection would be closed. Then in call initConn would wrap that connection into a SingleConnPool, and wrap into a new baseClient with the same retry settings.

If it call Hello command, it would find out that connection is already closed ( not health connection ), may get a EOF error, which result in shouldRetry true. Then it would do unreasonable retry until attempt times larger than MaxRetries setting.

Maybe we should not retry if we got a badConnError and baseClient.connPool is either SingleConnPool nor StickyConnPool

Context (Environment)

go-redis v9.14.0, with redis 5 sentinel server

wyxloading avatar Sep 30 '25 06:09 wyxloading

Hello @wyxloading Thanks for reporting this issue. We will investigate. In the meantime, if you have a code snippet that resolves the issue, feel free to open a PR

htemelski-oss avatar Oct 02 '25 07:10 htemelski-oss

Hello @wyxloading I was not able to reproduce this, could you please provide a snippet to reproduce it?

ndyakov avatar Oct 31 '25 13:10 ndyakov

hi, sorry for the late response. check https://github.com/wyxloading/go-redis-issue-3540-example for the reproduce steps

wyxloading avatar Dec 02 '25 08:12 wyxloading