grpc-go

ringhash: fix bug where ring hash can be stuck in transient failure despite having available endpoints

Open atollena opened this issue 7 months ago • 2 comments

See https://github.com/grpc/grpc-go/issues/7363 for a problem description. The added test cases reproduce the error without the fix, though not reliably, since the ring may be constructed in a way where we don't get stuck. However, running the test multiple times does end up triggering the problem.

The solution is to keep a separate list of available addresses, and when there are no picks and we trigger connection attempts, try them one at a time.
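To illustrate the idea, here is a minimal sketch, not the actual grpc-go code. The names `connState`, `endpoint`, `tracker`, `nextToConnect`, and `onStateChange` are all hypothetical, and the state machine is simplified. It assumes the balancer keeps endpoints in a list separate from the ring and, on each state change, starts at most one new connection attempt when nothing is ready or already connecting.

```go
package main

// connState is a simplified stand-in for gRPC connectivity states
// (hypothetical; the real balancer uses connectivity.State).
type connState int

const (
	idle connState = iota
	connecting
	ready
	transientFailure
)

// endpoint is a hypothetical record for one backend address.
type endpoint struct {
	addr  string
	state connState
}

// tracker keeps endpoints in a stable list that is independent of
// how the hash ring happens to be laid out.
type tracker struct {
	endpoints []*endpoint
}

// nextToConnect returns the first idle endpoint, or nil when an
// endpoint is already connecting or ready, or when none is idle.
func (t *tracker) nextToConnect() *endpoint {
	var candidate *endpoint
	for _, e := range t.endpoints {
		switch e.state {
		case connecting, ready:
			// An attempt is in flight or a backend is usable:
			// do not start another attempt.
			return nil
		case idle:
			if candidate == nil {
				candidate = e
			}
		}
	}
	return candidate
}

// onStateChange records the new state of e and, without relying on
// picks, starts at most one new connection attempt. It returns the
// endpoint that was moved to connecting, or nil if none was.
func (t *tracker) onStateChange(e *endpoint, s connState) *endpoint {
	e.state = s
	next := t.nextToConnect()
	if next != nil {
		next.state = connecting
	}
	return next
}
```

With three endpoints where the first one fails, this sketch moves on to the second, then the third, one at a time, so recovery does not depend on which ring entries a pick would have landed on.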

RELEASE NOTES:

  • ringhash: fix bug that could prevent the balancer from recovering from transient failure.

atollena, Jun 28 '24 10:06