celestia-node
feat(p2p): Extract bootstrapping into separate module
Implementation ideas
The current implementation of the Celestia node spreads the logic for keeping connections to other nodes alive across multiple packages, notably the go-header package and discovery. This approach has led to inefficiencies, particularly in scenarios where the node becomes isolated from its peers due to networking issues. Presently, if the node loses all connections and remains disconnected for a prolonged period, it may fail to re-initiate connections to known peers or bootstrappers, effectively stalling its network participation.
This issue proposes the creation of a dedicated module responsible for aggregating all connection management logic. This module will ensure the node maintains at least one active connection to either bootstrappers or previously seen peers upon launch. Moreover, it will actively manage reconnections to the network if the node finds itself disconnected from all peers, addressing the current limitation where the node might cease attempting to reconnect.
Objectives:
- Create a module within the Celestia node that ensures there is always a live connection to the network.
- Extract and consolidate existing connection logic from the go-header package and any other relevant parts of the codebase into this new module.
- Ensure the module initiates at least one connection to known peers or bootstrappers upon node startup.
- Implement a mechanism within the module to detect loss of all connections and automatically attempt reconnection to the network, using a list of previously successful connections and bootstrappers.
I don't think we need "centralized" connection management, and I don't think that the current approach leads to inefficiencies. libp2p already centralizes the connection management logic. Adding another central connection management layer would repeat the disaster of Tendermint 0.35, which over-engineered its own connection management layer on top of libp2p, and that didn't work.
> This approach has led to inefficiencies, particularly in scenarios where the node could become isolated from its peers due to networking issues.
I don't know of any "inefficiencies" that the canonical libp2p protocol architecture we follow has led to, and we need more evidence in this issue before changing it.
What we originally discussed in DMs and agreed to open an issue for was much simpler and much narrower in its purpose. We need a component that ensures we have connections: something that detects there are no connections and perpetually tries to re-establish them with known peers or bootstrappers, with the single goal of surviving extended network losses.
I think we are on the same page here. The connection management functions are described in objectives 3 and 4 and are pretty simple and straightforward. Those functions could also be removed from go-header, as they will be managed in the new module. Maybe "centralized connection manager" is a bit too powerful a name for such a component. Naming can be reconsidered by the implementers.
It's not only a naming issue, as long as the issue lists a unified connection management module among its objectives.
Even though I am grateful that you opened this issue, I believe it's overcomplicated and misleading.
We should also discuss how the two would work together. Currently go-header manages that for a good reason: it defines and manages the notion of trusted peers, which we cannot easily extract elsewhere. But I agree it's worth looking into in the long term.
One way to manage this is for the header module to take the new module as a dependency in its constructors. Trusted peers as well as connected peers could be accessed through the new module's API. We would need to define an interface on the go-header side so the modules can be easily decoupled. This allows a proper module initialization order and lets go-header components be sure that connections are established by the time they start.