go-redis
go-redis copied to clipboard
Return early when context signals done
Hi! My app's logs are full of errors that look like this:
redis: 2022/06/18 01:21:35 sentinel.go:587: sentinel: GetMasterAddrByName name="gprd-redis" failed: context canceled
redis: 2022/06/18 01:21:35 sentinel.go:587: sentinel: GetMasterAddrByName name="gprd-redis" failed: context canceled
redis: 2022/06/18 01:21:35 sentinel.go:587: sentinel: GetMasterAddrByName name="gprd-redis" failed: context canceled
redis: 2022/06/18 01:21:35 sentinel.go:587: sentinel: GetMasterAddrByName name="gprd-redis" failed: context canceled
<my app's message>: redis: all sentinels specified in configuration are unreachable
Almost always there are several identical log lines with the same timestamp - one per configured sentinel. I looked at the code and noticed that when context is done, the code keep iterating and trying other sentinels. When it runs out of sentinels to try, it returns the final error and my app logs it.
I'd like to improve the behavior and just return the error from the context in this case so that the app can recognize the situation and react accordingly. In my case "accordingly" means not sending the error to Sentry and not logging it, but instead just returning a proper response code to the client.
WDYT?
Thanks for an awesome library!
p.s. While working on this PR is occurred to me that it's a bit weird to use the passed context to connect to sentinel/etc. Wouldn't it be better to do it asynchronously and have a working sentinel/server/whatever connected to and ready to go when a library's method is called? I think this is how gRPC works - it manages the pool of healthy underlying TCP connections asynchronously and pick one when necessary. If none available, it blocks until one becomes available OR context is done. If context is done, it returns early but still tries to establish a connection in the background.
Hi,
he code keep iterating and trying other sentinels
I think we should just stop iterating on ctx.Canceled errors, e.g.
for {
err := something()
if err == context.Canceled {
break
}
}
What do you think?
@vmihailenco You are right, I've looked at the code a bit more and refactored it to only check errors.
Not the tests are failing. Not sure why...
Ok, found and fixed the problem. One of the build actions still fails but it's because of an I/O timeout. Looks unrelated to the code I changed.
PTAL @vmihailenco
Thanks