besu
besu copied to clipboard
peer chain state height estimate no longer a good measure of how good a peer is
situation on holesky - besu has pivot block 1278496 then disconnects a holesky bootnode ac906 because bootnode peer has chain height estimate 1278535 which is below pivotBlock 1278691 right now holesky is up to 1279130 (so assuming bootnode would have that block height) what's weird is
- bootnode peer is reporting chain height estimate less than current chain height (so maybe that chain height estimate is not so useful any more, at least in initial sync?)
- we can disconnect the bootnode (which is at that moment our only peer) because of this line - without any regard for how many peers we currently have, the logic being if none of our peers can give us the pivot block they must all be useless so we may as well disconnect one
` {"@timestamp":"2024-04-04T09:06:31,874","level":"INFO","thread":"nioEventLoopGroup-3-1","class":"SnapWorldStateDownloader","message":"Downloading world state from peers for pivot block 1278496 (0x4f5ee30c1f0c142a0760d0db9ca663ff4a3fc3a2ae8b66ea4b4398080ec28104). State root 0x4ffc56dc9bd37bad19bde9c85b5829bbe620116edf44544bac51dc53e97629a6 pending requests 0","throwable":""}
{"@timestamp":"2024-04-04T10:03:32,795","level":"INFO","thread":"EthScheduler-Services-42 (importBlock)","class":"SyncTargetManager","message":"Best peer 0xac906289e4b7f12d... has chain height 1278535 below pivotBlock height 1278691. Waiting for better peers. Current 1 of max 25","throwable":""} {"@timestamp":"2024-04-04T10:03:37,801","level":"INFO","thread":"EthScheduler-Timer-0","class":"SyncTargetManager","message":"Unable to find sync target. Currently checking 0 peers for usefulness.","throwable":""} {"@timestamp":"2024-04-04T10:05:37,815","level":"INFO","thread":"EthScheduler-Timer-0","class":"SyncTargetManager","message":"Unable to find sync target. Currently checking 0 peers for usefulness.","throwable":""} {"@timestamp":"2024-04-04T10:07:37,821","level":"INFO","thread":"EthScheduler-Timer-0","class":"SyncTargetManager","message":"Unable to find sync target. Currently checking 0 peers for usefulness.","throwable":""} {"@timestamp":"2024-04-04T10:09:37,827","level":"INFO","thread":"EthScheduler-Timer-0","class":"SyncTargetManager","message":"Unable to find sync target. Currently checking 0 peers for usefulness.","throwable":""} `
and with some extra debug logging - it seems bootnodes CAN be initialized with chain height of zero -
sallymacfarlane@dev-elc-besu-teku-holesky-dev-sally-peer-compar-3:~$ grep "chain height" /var/log/besu/besu.log {"timeMillis":1712292067240,"thread":"nioEventLoopGroup-3-1","level":"INFO","loggerName":"org.hyperledger.besu.ethereum.eth.sync.fastsync.SyncTargetManager","message":"Best peer 0xac906289e4b7f12d... has chain height 0 below pivotBlock height 1283699 even with tolerance 700. Waiting for better peers. Current 1 of max 25","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","contextMap":{},"threadId":86,"threadPriority":10} {"timeMillis":1712292067274,"thread":"nioEventLoopGroup-3-2","level":"INFO","loggerName":"org.hyperledger.besu.ethereum.eth.sync.fastsync.SyncTargetManager","message":"Best peer 0xa3435a0155a3e837... has chain height 0 below pivotBlock height 1283699 even with tolerance 700. Waiting for better peers. Current 2 of max 25","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","contextMap":{},"threadId":88,"threadPriority":10} {"timeMillis":1712292767306,"thread":"EthScheduler-Services-16 (checkNewPivotBlock-Code)","level":"DEBUG","loggerName":"org.hyperledger.besu.ethereum.eth.sync.snapsync.DynamicPivotBlockSelector","message":"Searching for a new pivot: current pivot 1283876 best chain height 1283970 distance next pivot 63 last pivot block found 1283907 (0xc6f25411221fe0d187dce61ca72e74290267d938ff50eb42d2dedd542fa1f72a)","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jEventBuilder","contextMap":{},"threadId":63,"threadPriorit
exploratory branch https://github.com/hyperledger/besu/compare/main...macfarla:besu:chain-height-estimate-tolerance
This is a good point for mainnet, since the merge, there is no more block broadcasting, and so the peer estimate height is no more updated on regular basis and can be stale, but for PoW and PoA networks this could create issues, so I think we should go a step forward and rethink the whole sync target strategy and the pivot block for PoS networks, as I try to explain below.
On PoS network, the chain head is sent by the CL, so we should just use it as pivot block, try to download that block from any peer and use it as sync target, and ignore the current ways of finding the sync target based on the estimated height.
closing this issue since it's been addressed in #7162