snarkOS icon indicating copy to clipboard operation
snarkOS copied to clipboard

Update syncing logic to fix duplicate block requests

Open kpandl opened this issue 1 year ago • 0 comments

Motivation

As issue #3404 described, validators send out duplicate block requests when syncing. This PR updates the syncing logic to make sure we only remove a block response after it was added to the ledger or if we encountered an error.

For background, currently, syncing works as follows: BlockSync requests the blocks, and stores it in the internal responses map. Sync calls block_sync.process_next_block(current_height) to get the next block to process. This removes it from BlockSync’s responses and returns it to Sync. Sync performs the checks, and if successful, adds the block to the ledger. This also updates BlockSync’s canon.

The issue is that BlockSync‘s view of the canon is only updated after Sync is done processing blocks (and adding to the ledger). There’s a window where BlockSync is unaware of blocks that are pending validation in Sync. Thus, it re-requests them.

This PR adjusts the block processing interface to the Sync module as follows:

  • process_next_block READS a block from a particular height
  • remove_block_response REMOVES a block from a particular height, which MUST be called after advancing completed or failed

Test Plan

Ran it locally and verified it removes the duplicate block request.

The following figure shows a syncing process before the fix, with a longer pause until a duplicate request is generated: syncing_not_fixed

The following figure shows the syncing process with the fix (note there is no double request and no delay, as such, the syncing is much faster): syncing_fixed

Closes #3404

kpandl avatar Oct 03 '24 13:10 kpandl