grpc-go
ringhash: fix bug where ring hash can be stuck in transient failure despite having available endpoints
See https://github.com/grpc/grpc-go/issues/7363 for a problem description. The added test cases reproduce the error without the fix, but not reliably, since the ring may be constructed in a way where we don't get stuck. However, running the tests multiple times does end up triggering the problem.
The solution is to keep a separate list of available addresses, and when there are no picks and we trigger connection attempts, try them one at a time.
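A rough sketch of the idea, for illustration only and not the code in this PR: the types `endpoint` and `recoveryList`, the `connState` constants, and the method `triggerOne` are all hypothetical names, and the real balancer tracks state through gRPC's connectivity types rather than a local enum. Assuming the balancer keeps a flat list of endpoints next to the ring, each trigger starts at most one new connection attempt and advances a cursor so that repeated triggers walk through the whole list.

```go
package main

// connState is a simplified, hypothetical stand-in for gRPC connectivity states.
type connState int

const (
	stateIdle connState = iota
	stateConnecting
	stateReady
	stateTransientFailure
)

// endpoint is a hypothetical record for one backend address.
type endpoint struct {
	addr  string
	state connState
}

// recoveryList is the separate list of addresses kept alongside the ring.
// next is the index at which the following scan starts.
type recoveryList struct {
	endpoints []*endpoint
	next      int
}

// triggerOne marks at most one endpoint as connecting and returns its address.
// Only endpoints that are idle or in transient failure are eligible; endpoints
// that are already connecting or ready are skipped. It reports false when no
// endpoint is eligible.
func (r *recoveryList) triggerOne() (string, bool) {
	n := len(r.endpoints)
	for i := 0; i < n; i++ {
		ep := r.endpoints[(r.next+i)%n]
		if ep.state == stateIdle || ep.state == stateTransientFailure {
			ep.state = stateConnecting
			// Resume after this endpoint on the next trigger.
			r.next = (r.next + i + 1) % n
			return ep.addr, true
		}
	}
	return "", false
}
```

Because the scan does not depend on where a request hashes on the ring, an endpoint that could become reachable is eventually tried even if every pick lands on failing entries.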
RELEASE NOTES:
- ringhash: fix bug that could prevent the balancer from recovering from transient failure.