go-redis Return early when context signals done

Hi! My app's logs are full of errors that look like this:

redis: 2022/06/18 01:21:35 sentinel.go:587: sentinel: GetMasterAddrByName name="gprd-redis" failed: context canceled
redis: 2022/06/18 01:21:35 sentinel.go:587: sentinel: GetMasterAddrByName name="gprd-redis" failed: context canceled
redis: 2022/06/18 01:21:35 sentinel.go:587: sentinel: GetMasterAddrByName name="gprd-redis" failed: context canceled
redis: 2022/06/18 01:21:35 sentinel.go:587: sentinel: GetMasterAddrByName name="gprd-redis" failed: context canceled
<my app's message>: redis: all sentinels specified in configuration are unreachable

Almost always there are several identical log lines with the same timestamp - one per configured sentinel. I looked at the code and noticed that when context is done, the code keep iterating and trying other sentinels. When it runs out of sentinels to try, it returns the final error and my app logs it.

I'd like to improve the behavior and just return the error from the context in this case so that the app can recognize the situation and react accordingly. In my case "accordingly" means not sending the error to Sentry and not logging it, but instead just returning a proper response code to the client.

WDYT?

Thanks for an awesome library!

p.s. While working on this PR is occurred to me that it's a bit weird to use the passed context to connect to sentinel/etc. Wouldn't it be better to do it asynchronously and have a working sentinel/server/whatever connected to and ready to go when a library's method is called? I think this is how gRPC works - it manages the pool of healthy underlying TCP connections asynchronously and pick one when necessary. If none available, it blocks until one becomes available OR context is done. If context is done, it returns early but still tries to establish a connection in the background.

Jun 22 '22 09:06 ash2k

Hi,

he code keep iterating and trying other sentinels

I think we should just stop iterating on ctx.Canceled errors, e.g.

for {
    err := something()
    if err == context.Canceled {
        break
    }
}

What do you think?

Jun 27 '22 06:06 vmihailenco

@vmihailenco You are right, I've looked at the code a bit more and refactored it to only check errors.

Jun 27 '22 10:06 ash2k

Not the tests are failing. Not sure why...

Jun 27 '22 10:06 ash2k

Ok, found and fixed the problem. One of the build actions still fails but it's because of an I/O timeout. Looks unrelated to the code I changed.

Jul 03 '22 02:07 ash2k

PTAL @vmihailenco

Jul 22 '22 03:07 ash2k

Thanks

Oct 05 '22 07:10 vmihailenco

go-redis go-redis copied to clipboard

Return early when context signals done

go-redis
go-redis copied to clipboard