grpc-go icon indicating copy to clipboard operation
grpc-go copied to clipboard

pickfirst: Implement Happy Eyeballs

Open arjan-bal opened this issue 1 year ago • 4 comments

As part of the Dualstack design, the pickfirst policy should implement the happy eyeballs algorithm while connecting to multiple backends.

The timeout for the happy eyeballs connection timer is NOT configurable as that's an optional requirement in the gRFC.

RELEASE NOTES:

  • The new experimental pickfirst LB policy (disabled by default) supports Happy Eyeballs to attempt connections to multiple backends concurrently. The experimental pickfirst policy can be enabled by setting the environment variable GRPC_EXPERIMENTAL_ENABLE_NEW_PICK_FIRST to true.

arjan-bal avatar Oct 10 '24 10:10 arjan-bal

Codecov Report

Attention: Patch coverage is 86.81319% with 12 lines in your changes missing coverage. Please review.

Project coverage is 81.74%. Comparing base (18d218d) to head (5c4ff49). Report is 9 commits behind head on master.

Files with missing lines Patch % Lines
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go 86.36% 9 Missing and 3 partials :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7725      +/-   ##
==========================================
- Coverage   82.00%   81.74%   -0.27%     
==========================================
  Files         373      374       +1     
  Lines       37735    37930     +195     
==========================================
+ Hits        30945    31004      +59     
- Misses       5512     5615     +103     
- Partials     1278     1311      +33     
Files with missing lines Coverage Δ
balancer/pickfirst/internal/internal.go 100.00% <100.00%> (ø)
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go 88.83% <86.36%> (+0.06%) :arrow_up:

... and 39 files with indirect coverage changes

codecov[bot] avatar Oct 10 '24 10:10 codecov[bot]

Should we mention the environment variables in the release note? Or at least in the PR description?

easwars avatar Oct 10 '24 19:10 easwars

Should we mention the environment variables in the release note? Or at least in the PR description?

Updated the release notes.

arjan-bal avatar Oct 11 '24 06:10 arjan-bal

Quoting this paragraph from A61: " Note that because we do not report TRANSIENT_FAILURE until after the Happy Eyeballs pass has completed and we start a new Happy Eyeballs pass whenever we receive a new address list, there is a potential failure mode where we may never report TRANSIENT_FAILURE if we are receiving new address lists faster than we are completing Happy Eyeballs passes. This is a pre-existing problem, and each gRPC implementation currently deals with it in its own way. This design does not propose any changes to those existing approaches, although a future gRFC may attempt to achieve further convergence here. "

I guess we never had any logic/code to handle this prior to Happy Eyeballs. Do have an idea of how to handle this?

This failure mode exists in the existing pickfirst which calls addrConn.updateAddrs, to handle this scenario. I'm not sure if grpc-go does anything to prevent this. The DNS resolver has throttling, but that's not a universal fix. Java doesn't have anything to handle this in the pickfirst LB policy, though the channel may have a way of preventing this.

Let me start a discussion in the cross language group.

arjan-bal avatar Oct 17 '24 06:10 arjan-bal