[P2P] Router bootstrapping
Objective
Clarify bootstrapping requirements and constraints well enough to align on "correct" behavior and to realize "low-hanging" optimization opportunities.
Origin Document
Questions surfaced while working on #732 & #694.
Goals
- Clarify bootstrapping "success"/"failure" conditions
- Reduce time to bootstrap (or fail) in router implementations
- Consider how bootstrapping status is signaled to other modules (esp. if we plan on removing the FSM)
- Account for TTL-based nature of libp2p peerstore
Legend
flowchart
a[State description]
subgraph next[Next state description]
nest[Nested state description]
end
other[Other state]
cond{Condition}
act([Action])
a --> next
next --> cond
cond --"condition value"--> other
cond --"alternative value"--> act
act --"result"--> other
Flowchart
flowchart
start[Node Startup]
subgraph persStart[Persistence Module Start]
hasState{Does this node already\nhave some state?}
gen(["Genesis hydration\n(initial staked actor\nidentities\nadded to state)"])
end
hasState --"NO"--> gen
subgraph p2pStart[P2P Module Start]
subgraph l[Libp2p Host Setup]
ll([Libp2p host\nlistening])
end
subgraph sar[Staked Actor Router Setup]
sarHandle([Staked actor router\nprotocol handler\nregistered])
end
subgraph usar[Unstaked actor Router Setup]
usarDisc([Unstaked actor DHT peer\ndiscovery start])
usarHandle([Unstaked actor router\nprotocol handler\nregistered])
usarGossip([Unstaked actor router\nGossipsub setup])
end
end
l --> sar
sar --> usar
bs[P2P Bootstrapping Start]
isBSNode{Is this node\nconfigured as a\nbootstrap node?}
start --> persStart
persStart --> p2pStart
p2pStart --> bs
bs --> isBSNode
isBSNode --"NO"--> bsProg
isBSNode --"YES"--> bsBSNode
firstBS --> bsReachable
bsReachable --"YES"--> rpc
subgraph bsProg[RPC bootstrapping]
firstBS[Considering first configured bootstrap node]
rpc([Get staked peers from\n`rpcPeerstoreProvider`\nusing bootstrap node])
isStaked{Is this node a\nstaked actor?}
firstPeer[Considering first peer]
nextPeer[Considering next peer]
peerReachable{Is the current\npeer healthy?}
morePeers{Are there more peers?}
bsReachable{"is the current\nbootstrap node healthy?"}
addStaked([Add peer to\nstaked actor router])
addUnstaked([Add peer to\nunstaked actor router])
addLibp2p([Add peer to libp2p host])
con([Attempt to connect])
minbs{are >= 3 peers\nconnected?}
bsRetry(["Retry"])
bsAttempts{"Max attempts\nreached for this\npeer?"}
moreBS{Are there more\nconfigured\nbootstrap nodes?}
nextBS[Considering next\nbootstrap node]
end
minbs --"NO"--> bsFail
minbs --"YES"--> bsDone
bsReachable --"NO"--> nextBS
rpc --> firstPeer
firstPeer --> peerReachable
peerReachable --"YES"--> isStaked
isStaked --"YES"--> addStaked
isStaked --"NO"--> addUnstaked
addStaked --> addUnstaked
addUnstaked --> addLibp2p
peerReachable --"NO"--> nextPeer
nextPeer --> peerReachable
addLibp2p --> con
con --"success"--> morePeers
morePeers --"NO"--> moreBS
morePeers --"YES"--> nextPeer
con --"error"--> bsAttempts
nextBS --> bsReachable
bsFail --> nodeFail
bsAttempts --"NO"--> bsRetry
bsAttempts --"YES"--> nextPeer
bsRetry --> con
moreBS --"NO" --> minbs
moreBS --"YES"--> nextBS
subgraph bsBSNode[Bootstrap node setup]
gps["Get staked peers from\n`persistencePeerstoreProvider`\n(last known state; possibly genesis\nor a snapshot)"]
addAllStaked([Add peers to\nstaked actor router])
addAllUnstaked([Add peers to\nunstaked actor router])
addAllLibp2p([Add peers to libp2p\nhost peerstore])
isStaked2{Is this node a\nstaked actor?}
end
gps --> isStaked2
isStaked2 --"YES"--> addAllStaked
isStaked2 --"NO"--> addAllUnstaked
addAllStaked --> addAllUnstaked
addAllUnstaked --> addAllLibp2p
addAllLibp2p --> bsDone
bsFail["P2P Bootstrapping Failure"]
nodeFail["Node Startup Failure"]
bsDone["P2P Bootstrapping Success"]
fsm["State Machine Transition\n(`P2P_IsBootstrapped`)"]
rest[...]
bsDone --> fsm
fsm --> rest
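The "RPC bootstrapping" branch of the flowchart can be read as the following sequential loop. This is a sketch of the decision logic only, under stated assumptions: `bootstrapDeps` and its function fields are hypothetical hooks standing in for the real health checks, `rpcPeerstoreProvider`, routers, and libp2p host, and skipping peers that are already connected is an addition the flowchart does not specify.

```go
package main

import "errors"

// minConnectedPeers mirrors the ">= 3 peers connected?" condition in the flowchart.
const minConnectedPeers = 3

var errBootstrapFailed = errors.New("p2p bootstrapping failed: fewer than 3 peers connected")

// bootstrapDeps groups hypothetical hooks; none of these names exist in the codebase.
type bootstrapDeps struct {
	isHealthy   func(addr string) bool              // health check for a bootstrap node or peer
	stakedPeers func(bootstrapAddr string) []string // stands in for rpcPeerstoreProvider
	addPeer     func(peerAddr string)               // add to router(s) and the libp2p host
	connect     func(peerAddr string) error         // attempt to connect
	maxAttempts int                                 // connection attempts per peer
}

// bootstrap walks the configured bootstrap nodes in order and reports how
// many distinct peers were connected, failing if that is below the threshold.
func bootstrap(bootstrapAddrs []string, deps bootstrapDeps) (int, error) {
	connected := make(map[string]struct{})
	for _, bsAddr := range bootstrapAddrs {
		if !deps.isHealthy(bsAddr) {
			continue // consider the next configured bootstrap node
		}
		for _, peerAddr := range deps.stakedPeers(bsAddr) {
			// Assumption: a peer learned from several bootstrap nodes counts once.
			if _, done := connected[peerAddr]; done || !deps.isHealthy(peerAddr) {
				continue // consider the next peer
			}
			deps.addPeer(peerAddr)
			for attempt := 0; attempt < deps.maxAttempts; attempt++ {
				if err := deps.connect(peerAddr); err == nil {
					connected[peerAddr] = struct{}{}
					break
				}
				// On error, retry until maxAttempts is reached, then move on.
			}
		}
	}
	if len(connected) < minConnectedPeers {
		return len(connected), errBootstrapFailed // leads to "Node Startup Failure"
	}
	return len(connected), nil // leads to the `P2P_IsBootstrapped` transition
}
```

Written this way, the worst-case time to fail is visible at a glance: it grows with the number of bootstrap nodes, the number of peers each returns, and `maxAttempts`, since nothing runs in parallel.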
Deliverable
- [ ] Determine success/failure bootstrapping condition(s) (e.g. when quorum number of known bootstrap nodes are (un)reachable)
- [ ] Update P2P docs to describe bootstrapping success and failure scenarios
- [ ] Update router bootstrapping implementations respectively
- [ ] Support simultaneous dialing of bootstrap nodes with some "max concurrency" (see: #694)
- [ ] Design bootstrap status signaling mechanism / convention
- [ ] Ensure libp2p peerstore network addresses expire & renew appropriately
  - Consider using `AddressTTL` as default
  - Should staked actor network addresses expire? (see: `PermanentAddrTTL`)
Non-goals / Non-deliverables
- Remove or replace the state machine module
General issue deliverables
- [ ] Update the appropriate CHANGELOG(s)
- [ ] Update any relevant local/global README(s)
- [ ] Update relevant source code tree explanations
- [ ] Add or update any relevant or supporting mermaid diagrams
Testing Methodology
- [ ] All tests: `make test_all`
- [ ] LocalNet: verify a `LocalNet` is still functioning correctly by following the instructions at docs/development/README.md
- [ ] k8s LocalNet: verify a `k8s LocalNet` is still functioning correctly by following the instructions here
Creator: @bryanchriswhite Co-Owners:
@bryanchriswhite I have not done a deep dive into it, but am aware that libp2p has its own opinion, approach and tooling to bootstrap (e.g. [1]). Questions are:
- Are you aware of it and/or have looked into it?
- Is it one of the potential options we are, or should, consider?
[1] https://discuss.libp2p.io/t/how-to-create-bootstrap-node-correctly-always-searching-for-other-peers/1389
TL;DR: at the moment, everything is expressed in terms of pokt addresses at the highest level, which adds an otherwise unnecessary layer of complexity.
@Olshansk, I am aware of and we are using the go-libp2p-kad-dht package to facilitate unstaked actor (aka background) router bootstrapping. However, until we go libp2p-native with respect to at least peer IDs, we have to ensure that both routers can map a given pokt address to its corresponding public key.
FWIW, my experience has also been that significant changes were made to that library relatively recently, which render many of the discussions and examples I've encountered irrelevant, including conversations with ChatGPT. :confused: That said, I think we're pretty well sorted on that front (see: kad_discovery_baseline_test.go).