grpc-go icon indicating copy to clipboard operation
grpc-go copied to clipboard

grpc: Hold mutex while calling resetTransport to prevent concurrent connection attempts

Open arjan-bal opened this issue 1 year ago • 7 comments

This change ensures that caller of resetTransport() keep holding the mutex instead of releasing it and having resetTransport() re-acquire it. This ensures that no concurrent requests are able to start once caller of resetTransport does some validation and calls resetTransport.

See https://github.com/grpc/grpc-go/issues/7365#issuecomment-2208258873 for more details.

Tested

Verified that Test/AuthorityRevive no longer flakes for 100000 attempts with the change.

Fixes: https://github.com/grpc/grpc-go/issues/7365

RELEASE NOTES:

  • Fix race condition that could lead to multiple transports being created in parallel. Only one transport is closed when the subConn is updated and the other transports are orphaned. The orphaned transports may get closed if the server returns an error.

arjan-bal avatar Jul 04 '24 07:07 arjan-bal

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 81.35%. Comparing base (daab563) to head (76ef33f).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7390      +/-   ##
==========================================
- Coverage   81.51%   81.35%   -0.17%     
==========================================
  Files         348      348              
  Lines       26744    26741       -3     
==========================================
- Hits        21801    21754      -47     
- Misses       3764     3793      +29     
- Partials     1179     1194      +15     
Files Coverage Δ
clientconn.go 92.40% <100.00%> (-0.99%) :arrow_down:

... and 23 files with indirect coverage changes

codecov[bot] avatar Jul 04 '24 07:07 codecov[bot]

Release notes needs to be prefixed with "package name:" in this case balancer:?

purnesh42H avatar Jul 05 '24 04:07 purnesh42H

Release notes needs to be prefixed with ":" in this case balancer:?

This is a change in the outermost grpc package. Added the package name in the release notes.

arjan-bal avatar Jul 05 '24 06:07 arjan-bal

is Test/ServerSideXDS_WithValidAndInvalidSecurityConfiguration failure related to change or just a flake?

purnesh42H avatar Jul 08 '24 10:07 purnesh42H

is Test/ServerSideXDS_WithValidAndInvalidSecurityConfiguration failure related to change or just a flake?

It's a flake, there's an existing issue for this and I've commented on the issue about this failure: https://github.com/grpc/grpc-go/issues/6914

arjan-bal avatar Jul 08 '24 10:07 arjan-bal

RELEASE NOTES:

  • Fix race condition that could lead to multiple transports being created in parallel.

What is the user-visible symptom here? A memory leak? Or just an extra connection that was attempted, but that will quickly go away on its own anyway? (Or will the extra connection stick around until the channel is closed?)

dfawley avatar Jul 08 '24 19:07 dfawley

What is the user-visible symptom here? A memory leak? Or just an extra connection that was attempted, but that will quickly go away on its own anyway? (Or will the extra connection stick around until the channel is closed?)

What I'm seeing in the test is that only one transport is closed when the subConn is updated (ac.transport) and the other transport is orphaned. The orphaned transports get closed when the server is shutdown at the end of the test. Updated the release notes.

arjan-bal avatar Jul 09 '24 06:07 arjan-bal