pickfirst: Implement Happy Eyeballs
As part of the Dualstack design, the pickfirst policy should implement the happy eyeballs algorithm while connecting to multiple backends.
The timeout for the happy eyeballs connection timer is NOT configurable as that's an optional requirement in the gRFC.
RELEASE NOTES:
- The new experimental
pickfirstLB policy (disabled by default) supports Happy Eyeballs to attempt connections to multiple backends concurrently. The experimentalpickfirstpolicy can be enabled by setting the environment variableGRPC_EXPERIMENTAL_ENABLE_NEW_PICK_FIRSTtotrue.
Codecov Report
Attention: Patch coverage is 86.81319% with 12 lines in your changes missing coverage. Please review.
Project coverage is 81.74%. Comparing base (
18d218d) to head (5c4ff49). Report is 9 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| balancer/pickfirst/pickfirstleaf/pickfirstleaf.go | 86.36% | 9 Missing and 3 partials :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #7725 +/- ##
==========================================
- Coverage 82.00% 81.74% -0.27%
==========================================
Files 373 374 +1
Lines 37735 37930 +195
==========================================
+ Hits 30945 31004 +59
- Misses 5512 5615 +103
- Partials 1278 1311 +33
| Files with missing lines | Coverage Δ | |
|---|---|---|
| balancer/pickfirst/internal/internal.go | 100.00% <100.00%> (ø) |
|
| balancer/pickfirst/pickfirstleaf/pickfirstleaf.go | 88.83% <86.36%> (+0.06%) |
:arrow_up: |
Should we mention the environment variables in the release note? Or at least in the PR description?
Should we mention the environment variables in the release note? Or at least in the PR description?
Updated the release notes.
Quoting this paragraph from A61: " Note that because we do not report TRANSIENT_FAILURE until after the Happy Eyeballs pass has completed and we start a new Happy Eyeballs pass whenever we receive a new address list, there is a potential failure mode where we may never report TRANSIENT_FAILURE if we are receiving new address lists faster than we are completing Happy Eyeballs passes. This is a pre-existing problem, and each gRPC implementation currently deals with it in its own way. This design does not propose any changes to those existing approaches, although a future gRFC may attempt to achieve further convergence here. "
I guess we never had any logic/code to handle this prior to Happy Eyeballs. Do have an idea of how to handle this?
This failure mode exists in the existing pickfirst which calls addrConn.updateAddrs, to handle this scenario. I'm not sure if grpc-go does anything to prevent this. The DNS resolver has throttling, but that's not a universal fix. Java doesn't have anything to handle this in the pickfirst LB policy, though the channel may have a way of preventing this.
Let me start a discussion in the cross language group.