feat(discovery): discover triggered too frequently
In the case a node doesn't have its quota of peers, it will send a new discover request every second. There are no guarantees that the discover request can complete within 1 second.
https://github.com/celestiaorg/celestia-node/blob/accb0589d2de42a44de43e0974610d3a0186976d/share/p2p/discovery/discovery.go#L225-L229
This interval seems too aggressive and should probably be increased (e.g to 1 or even 5 minutes?)
@guillaumemichel i agree that the interval for retrying is a bit too aggressive and could be increased, but the actual deadline to FindPeers is a minute.
https://github.com/celestiaorg/celestia-node/blob/accb0589d2de42a44de43e0974610d3a0186976d/share/p2p/discovery/discovery.go#L286-L293
So basically there will always be a lookup running until enough peers are discovered.
So basically there will always be a lookup running until enough peers are discovered.
IIRC, that was the intention, but as you mentioned that might be too aggressive
Sometimes, a node fails to discover any peers on startup. Without discovered peers, the node is unable to perform most of its P2P logic. The downside of a longer cooldown is that it will cause the node application to halt for the cooldown duration in such cases. If some peers are discovered, the node should still aim to find at least 5 (our default) to rely less on the performance and availability of a single peer.
The idea behind aggressive retries is to bootstrap node into a stable network condition as soon as possible, perhaps at the cost of more resources spent on aggressive discovery. So I think the defaults should stay low.
I think it might be valuable for some users to have the ability to increase retry/timeout values if they are less concerned about the node being connected to the FN network.