ouroboros-network
Selection pressure for peers with long chainsync timeout.
When a connection is promoted to hot, a timeout in seconds is picked at random from the array [90, 135, 180, 224, 269] for use by the chainsync protocol. When there is a gap in block production, the timeout triggers and the peer is demoted to cold. The idea is that during a gap in block production only a subset of peers will be replaced. ~~This scheme works fine with the static peers when running in non-p2p mode~~.
In the p2p case the set of hot peers tends to contain more and more peers with a long chainsync timeout. Example: a node starts with 20 hot peers with the following timeouts: [4 x 90, 4 x 135, 4 x 180, 4 x 224, 4 x 269]. There is a 91s gap in block production. The four peers with a 90s timeout are then replaced with four new peers with random timeouts. This happens on all p2p nodes: timed-out peers are replaced with peers that have new random timeouts.
This means that peers with large timeouts accumulate in the set of hot peers on all nodes. When a 224s gap finally happens, it is not 20% of all peers that are replaced; it could be 30% or 40%.
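The drift can be made concrete with a small expected-value calculation. This is only a sketch under simplifying assumptions that are not part of ouroboros-network: each gap is treated as a single event, every peer whose timeout is shorter than the gap is replaced, and each replacement draws its timeout uniformly from the same array. The function name `expected_after_gap` is hypothetical.

```python
from fractions import Fraction

# Timeouts in seconds, taken from the array quoted in this issue.
TIMEOUTS = [90, 135, 180, 224, 269]


def expected_after_gap(counts, gap):
    """Expected number of hot peers per timeout after one gap of `gap` seconds.

    Assumption (illustrative only): every peer with timeout < gap is demoted
    and replaced by a peer whose timeout is uniform over TIMEOUTS.
    """
    replaced = sum(Fraction(n) for t, n in counts.items() if t < gap)
    share = replaced / len(TIMEOUTS)  # expected replacements landing in each class
    return {
        t: (Fraction(0) if t < gap else Fraction(n)) + share
        for t, n in counts.items()
    }


if __name__ == "__main__":
    counts = {t: Fraction(4) for t in TIMEOUTS}  # 20 hot peers, 4 per timeout
    for _ in range(3):  # three consecutive 136s gaps
        counts = expected_after_gap(counts, 136)
    total = sum(counts.values())
    for t in TIMEOUTS:
        print(t, float(counts[t] / total))
```

Under these assumptions a single 91s gap leaves an expected 0.8 peers with a 90s timeout and 4.8 in each of the other classes. Repeated 136s gaps push the 180s, 224s and 269s classes towards one third of the hot set each, which is in line with the 30% to 40% figure above.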
Instead of using a constant timeout for the lifetime of the connection, it would be better if the chainsync protocol picked a timeout at random each time it prepares to wait for the peer to present it with a new tip.
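A minimal sketch of the proposed behaviour, assuming one uniform draw per wait. The names `fresh_timeout` and `drop_probability` are hypothetical and are not part of the ouroboros-network API. Because the timeout is redrawn for every wait, the chance that a peer is dropped during a given gap depends only on the gap length and not on how long the peer has already been hot.

```python
import random
from fractions import Fraction

# Timeouts in seconds, taken from the array quoted in this issue.
TIMEOUTS = [90, 135, 180, 224, 269]


def fresh_timeout(rng=random):
    """Draw a new timeout for the next wait for the peer's tip (assumed uniform)."""
    return rng.choice(TIMEOUTS)


def drop_probability(gap):
    """Probability that a peer times out during one gap of `gap` seconds,
    given that its timeout was drawn fresh for this wait."""
    shorter = sum(1 for t in TIMEOUTS if t < gap)
    return Fraction(shorter, len(TIMEOUTS))


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print([fresh_timeout(rng) for _ in range(5)])
    print(drop_probability(91), drop_probability(225))
```

With per-wait draws, a 91s gap drops each peer with probability 1/5 regardless of its history, so the hot set cannot accumulate long-timeout peers.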