[Bug] Duplicate block requests when validators sync

Open kpandl opened this issue 1 year ago • 0 comments

🐛 Bug Report

When a validator syncs, it sends out a duplicate block request for a certain height, despite already having received a response for that height from the earlier request.

Specifically, it requests a range of blocks (e.g., 1 to n+m with m>1) and then advances to block n. It then sends out a new block request for height n+1, even though that height was already covered by the initial range request.

The bug slows down the syncing of validators.

Steps to Reproduce

Run the ./devnet.sh script with 4 validators and turn off 1 of them. Let the network run for ~30 blocks. Then start the fourth validator and let it sync. Consider adding logs to snarkOS for sync profiling.

Expected Behavior

The validator should use the block already sent to it and advance to it (and further).

Your Environment

  • snarkOS branch kp/profile/sync/ in ProvableHQ/snarkOS
  • macOS

Debugging notes

  • In sync_storage_with_blocks, there are two places where we advance with blocks - without and with BFT checks.
  • The issue occurs in the cases where we advance with BFT checks. sync_storage_with_block does not always update the canon ledger, so self.canon cannot necessarily tell up to which height responses have already been processed.
  • The block is still in self.latest_block_responses, but that member lives in Sync and not in BlockSync. A refactor should consider the entire syncing logic, and how Sync and BlockSync relate.
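
To make the notes above concrete, here is a minimal toy model of the suspected mechanism. It is not snarkOS code: `ToySync`, `pending_heights`, and both methods are hypothetical names, and the real `Sync`/`BlockSync` types are more involved. The sketch only assumes that the next request height is derived from the canon ledger height while already-received responses sit in a separate buffer.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for the sync state. All names are hypothetical.
struct ToySync {
    /// Height of the canon ledger (roughly what `self.canon` would report).
    canon_height: u32,
    /// Heights for which a block response was received but not yet
    /// committed to the canon ledger (loosely like `latest_block_responses`).
    pending_heights: BTreeSet<u32>,
}

impl ToySync {
    /// Looks only at the canon ledger, so it re-requests a height
    /// whose response is already buffered.
    fn next_request_naive(&self) -> u32 {
        self.canon_height + 1
    }

    /// Skips every consecutive height that already has a buffered response
    /// and returns the first height that is actually missing.
    fn next_request_skipping_buffered(&self) -> u32 {
        let mut height = self.canon_height + 1;
        while self.pending_heights.contains(&height) {
            height += 1;
        }
        height
    }
}
```

With `canon_height = 5` and buffered responses for heights 6 to 8, the naive variant asks for height 6 again, while the second variant asks for height 9. Whether the actual fix belongs in `Sync`, in `BlockSync`, or in a refactor of both is left open here.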

kpandl, Sep 24 '24 14:09