hsd icon indicating copy to clipboard operation
hsd copied to clipboard

HSD should not retry failed outbound connection indefinitely

Open HarukaMa opened this issue 5 years ago • 4 comments

Currently there are many nodes in seed list which are not online, however HSD would just try to connect to them forever if it have enough outbound connection allowance.

I think HSD should give up after a few tries unless there are really not enough connections to run the node.

HarukaMa avatar Apr 13 '20 12:04 HarukaMa

If hsd stops trying to make outbound connections, then it creates risk around eclipse attacks. If you are annoyed by the logging messages, run with --log-level info or apply this commit: https://github.com/tynes/hsd/commit/ea202ee4a792cd0355e0969bd62800b50eb95552

tynes avatar Apr 13 '20 22:04 tynes

I think it’s not about “stop trying to” completely. What I mean is if a node is known to be inaccessible even if it’s in the seed list, HSD should stop trying to connect to that node or implement something like exponential back off. I suppose HSD could still try to connect to newly discovered nodes. Unless there is no peer exchange mechanism, but I think there actually is (addr messages).

HarukaMa avatar Apr 14 '20 04:04 HarukaMa

I understand what you mean now, when a connection fails then it should deprioritize that address in the future. It would be interesting to write a script to scan through the hostlist and try to connect to a bunch of nodes to see how much of the hostlist is full of nodes that are either not listening or are no longer alive. I think having that kind of data could really help to inform the situation.

You can see here that the host is marked in the HostList as being ack'd after a connection is opened: https://github.com/handshake-org/hsd/blob/ed53f5f885af7488633adaf06e0ddfdd6f0be5b8/lib/net/pool.js#L1450

Here is where an attempt to connect to a peer is kept track of: https://github.com/handshake-org/hsd/blob/ed53f5f885af7488633adaf06e0ddfdd6f0be5b8/lib/net/pool.js#L1177 https://github.com/handshake-org/hsd/blob/489d3f78f7b9ee6e82168fc0c4ce65569853c71d/lib/net/hostlist.js#L705

When adding new addrs to the HostList, the stale ones are pruned if entry.attempts >= HostList.RETRIES: https://github.com/handshake-org/hsd/blob/489d3f78f7b9ee6e82168fc0c4ce65569853c71d/lib/net/hostlist.js#L572 https://github.com/handshake-org/hsd/blob/489d3f78f7b9ee6e82168fc0c4ce65569853c71d/lib/net/hostlist.js#L644

There are some return statements earlier in the function, so I don't know exactly how often the codepath will make it that far. It seems like the evictions will only happen if new addrs are being sent around, meaning new nodes coming onto the network.

The logic is here for selecting a host: https://github.com/handshake-org/hsd/blob/489d3f78f7b9ee6e82168fc0c4ce65569853c71d/lib/net/hostlist.js#L377

Maybe some logic around lastSuccess and attempts on the HostEntry object could be helpful when selecting a host. https://github.com/handshake-org/hsd/blob/489d3f78f7b9ee6e82168fc0c4ce65569853c71d/lib/net/hostlist.js#L1429-L1430


Are you running the latest version of hsd? There were some updates to the p2p protocol, it was updated to v4 as an attempt to harden the system. It should automatically migrate to the newer format if you are running a newer tagged release. You could try to move your hosts file into a new directory and then hsd should be able to create a new one and repopulate it. If you do this, do you see a difference at all? Would be curious to know if the hostfiles end up looking similarly.

tynes avatar Apr 14 '20 05:04 tynes

I'm running the node from the tip of master branch from start so I think I'm already using the latest p2p protocol. The node reports version 2.1.5.

HarukaMa avatar Apr 14 '20 05:04 HarukaMa