snarkOS icon indicating copy to clipboard operation
snarkOS copied to clipboard

[TOB] Clean up the list of connecting peers in case of handshake timeout

Open ljedrz opened this issue 1 year ago • 3 comments

Supersedes https://github.com/AleoHQ/snarkOS/pull/2690 as a much simpler and tighter solution.

I verified that this works by manually reducing the handshake timeout to 1ms and checking the number of connecting peers in 2 nodes after a timed out handshake between them (which is non-zero in testnet3 and zero with this PR). While adding a regular test case for this would be possible, it would be quite verbose (as one of the nodes would need to use a custom handshake impl), so I didn't include one.

This change should obviate the current extra Disconnect check, but I'm leaving it in for now just to be sure we're no longer hitting it.

Finding: TOB-ALEO-22

ljedrz avatar Oct 03 '23 10:10 ljedrz

Rebased and applied the same fix for the Gateway.

ljedrz avatar Oct 11 '23 08:10 ljedrz

It probably works but isn't a great design.

howardwu avatar Nov 10 '23 00:11 howardwu

Alternatives are difficult, because there is no high-level callsite where an inbound connection timeout can be handled; having that would break the modularity of the network stack, and would be significantly more complex than this proposal.

ljedrz avatar Nov 10 '23 09:11 ljedrz