go-libp2p icon indicating copy to clipboard operation
go-libp2p copied to clipboard

websocket: add reuseport support

Open chaitanyaprem opened this issue 2 years ago • 21 comments

Implemented reuse-port capability for websocket transport #1435 using the existing net/reuseport transport and libp2p/go-reuseport.

In order to utilize REUSEPORT, reuseport option has to be passed while creating the websocket transport. This would allow for multiple threads/routines to listen on same IP/port combination.

Included a test that would verify reuseport by running 2 go-routines that would listen on same address and accept connections from 4 different clients.

Reran the existing test-suite for the websocket package to ensure no other tests fail.

Note: Tested the new changes in ubuntu-20.04 and windows-11 environment.

chaitanyaprem avatar Apr 26 '23 07:04 chaitanyaprem

Thank you for your PR! This mostly looks good, just a few comments.

You are welcome. Pushed new commits addressing your review comments.

chaitanyaprem avatar Apr 27 '23 14:04 chaitanyaprem

@marten-seemann please approve the workflows post which this can be merged.

chaitanyaprem avatar May 02 '23 05:05 chaitanyaprem

Thank you @chaitanyaprem! The code looks good to me, but I'm a bit confused with the test case.

Unfortunately, we just switched back to the Gorilla WebSocket library (#2280). This is creating merge conflicts with this PR. Would you mind rebasing this PR on top of the current master?

Will resolve and push updated ones.

chaitanyaprem avatar May 16 '23 05:05 chaitanyaprem

Pushed updated changes as per review comments.

chaitanyaprem avatar May 21 '23 01:05 chaitanyaprem

Nice PR! (commenting as an outsider)

Q: This PR adds the ability to listen on the same port by multiple parties. The TCP implementation for reuseport will also try to use listening ports also for dialing. This is not handled in this PR, is that right?

Thanks @Maelkum for your comments. Addressed all the others. You are right that reuseport is not added for dialing as it doesn't seem straight-forward with gorilla websocket. Will see how to get around that.

chaitanyaprem avatar May 23 '23 05:05 chaitanyaprem

Wow, this is surprising that load balancing across multiple listeners is not functioning on windows as well as Mac even-though 2 sockets are allowed to listen on same address+port.

Wrt Windows, i found this ref which does not guarantee the behaviour of SO_REUSEADDR. Quote from windows ref is not very clear wrt load balancing. For example, if all of the sockets on the same port provide TCP service, any incoming TCP connection requests over the port cannot be guaranteed to be handled by the correct socket — the behavior is non-deterministic.

Has the TCP reuse port been tested for load balancing?

Note: While going through the web searching for expected behaviour for these socket options, noticed that for FreeBSD based systems in order for load balancing to be supported an additional option has to be set SO_REUSEPORT_LB, Ref. This is something to be explored and added to the reuseport package

Will update once i investigate further.

chaitanyaprem avatar May 26 '23 00:05 chaitanyaprem

Update on further investigation, looks like macOS also doesn't support load balancing across listeners. Few refs: Ref1 Ref2

So, for now I have skipped checking for load balancing of connections for any platform other than linux in the test script. Raised an enhancement issue on the go-reuseport repo https://github.com/libp2p/go-reuseport/issues/105 in order to enable LB functionality for FREEBSD based systems by setting SO_REUSEPORT_LB as indicated above.

chaitanyaprem avatar May 26 '23 10:05 chaitanyaprem

Pushed changes to reuse port in case of dailing. Note that in case of wss with SNI, reuse port cannot be used as the dail function has already been replaced in order to not resolve address and directly do a TCP Dial.

chaitanyaprem avatar May 30 '23 08:05 chaitanyaprem

@marten-seemann @MarcoPolo All required changes and review comments have been addressed. Please go ahead and merge this PR if nothing else is required.

chaitanyaprem avatar Jun 03 '23 05:06 chaitanyaprem

@chaitanyaprem The WebSocket tests are failing on CI. Can you please take a look?

marten-seemann avatar Aug 14 '23 03:08 marten-seemann

Surprisingly the test is successful now, i ran it on a local ubuntu vm (Ubuntu 22.04.3 LTS) as well and there also tests are successful.

It is hard to determine whether some other change fixed the issue or is the test just flaky.

chaitanyaprem avatar Aug 17 '23 01:08 chaitanyaprem

I reran the test suite and it's failing again. We can't introduce any flaky tests, please make sure it consistently passes (you can run the test suite with -count=1000 locally).

marten-seemann avatar Aug 17 '23 08:08 marten-seemann

I reran the test suite and it's failing again. We can't introduce any flaky tests, please make sure it consistently passes (you can run the test suite with -count=1000 locally).

Ran it 1000 times and it fails sometimes. Commenting the additional load balancing check for now. Looks like linux is doesn't gaurantee it.

chaitanyaprem avatar Aug 17 '23 08:08 chaitanyaprem

@sukunrt is gonna help take this over the finish line in time for the v0.31 release @chaitanyaprem

p-shahi avatar Aug 25 '23 03:08 p-shahi

@chaitanyaprem just checking in. Let me know if you have any questions here. I'm available in filecoin slack(link in repo README).

sukunrt avatar Sep 15 '23 08:09 sukunrt

@chaitanyaprem just checking in. Let me know if you have any questions here. I'm available in filecoin slack(link in repo README).

Got little busy with other work and hence not able to take a look at this. Thanks for providing a channel to reach out, will do once i start working on this again.

chaitanyaprem avatar Sep 25 '23 07:09 chaitanyaprem

Thanks for the update @chaitanyaprem

sukunrt avatar Sep 26 '23 11:09 sukunrt

Hi guys! I hope this is not me being rude.. At a similar time when this PR was opened, I played around with implementing a similar thing. I rebased those changes to sync with the latest master (at the time this repo used nhooyr.io/websocket).

Tested it mostly in the listen => dial scenario, compared to multiple listens.

Since this PR is open for a while now, I figured I'd throw the link here. Feel free to cherry pick any piece of code out of there if it's useful. Hope this isn't bad manners!

https://github.com/Maelkum/go-libp2p/pull/1

Maelkum avatar Feb 08 '24 14:02 Maelkum

Thanks @Maelkum. This PR has been dormant for a while so I think it's okay. Do you want to open a new PR or take over this one? @chaitanyaprem is it fine if @Maelkum continues the work here?

sukunrt avatar Feb 21 '24 14:02 sukunrt

Thanks @Maelkum. This PR has been dormant for a while so I think it's okay. Do you want to open a new PR or take over this one? @chaitanyaprem is it fine if @Maelkum continues the work here?

Thanks @Maelkum Feel free to takeover, have not been able to get to this one myself.

chaitanyaprem avatar Feb 21 '24 15:02 chaitanyaprem

@sukunrt @chaitanyaprem Thanks guys! I'll open a new PR and link this one there, I think it will be tidier that way.

Maelkum avatar Feb 24 '24 16:02 Maelkum