
Improve range sync with PeerDAS

Open dapplion opened this issue 1 year ago • 4 comments

Roadmap

  • (1) Downscore peers for invalid data https://github.com/sigp/lighthouse/pull/7352
    • Track who sent what in batches. We need to change PeerId to PeerGroup, which allows tracking who sent each column (see the sketch after this list)
      • Partially implemented in https://github.com/sigp/lighthouse/pull/6922, which removes the tracking of a PeerId -> request inside the batches
    • Make beacon processor errors more descriptive, such that sync can know which column caused the RpcBlock to be invalid
  • (2) Downscore peers for custody failures
    • Send requests to peers that are part of the SyncingChain peer set, instead of the global pool. This will cause SyncingChains to error frequently with NoPeer errors
      • Modify syncing batches to allow them to stay in the AwaitingDownload state when they have no peers
      • Remove good_peers_on_sampling_subnets
      • Extras:
        • Consider adding a fallback mechanism where we fetch from the global peer set, but only from peers that are synced up to the requested batch, and in that case don't penalize custody failures
        • Assume all finalized peers are in the same chain and improve SyncingChain grouping
        • Implement StatusV2
    • Change the by_range sync download algorithm to fetch blocks first, then columns. Use the blocks as the source of truth to match against columns and penalize custody failures.
      • V1: Assume the block peer is honest, and believe the blocks are canonical
      • V2: Send blocks to the processor to verify the proposer index and proposer signature (significantly increases the cost of an attack). Note: this will require much more complex logic, similar to lookup sync. Note^2: batch syncing will no longer download and process in parallel, because processing is sequential and we need to process blocks before completing a batch download.
  • (3) Request individual columns when a peer fails to serve the columns
    • To retry each by_range request individually, we need to be able to trigger them individually https://github.com/sigp/lighthouse/pull/6497
    • Add a custody_by_range meta-request similar to beacon_node/network/src/sync/network_context/custody.rs
    • Deprecate the current block_sidecar_coupling logic in favor of a meta-request similar to lookup sync or custody by root.
  • (4) Reconstruct if we can't download all the columns that we need but we have >= 50%
  • (5) Improve peer selection logic: which peer to select next for column requests? E.g. if a peer has a custody failure, should we never request from it again, or prevent requests to it for some time?
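
As a minimal sketch of the PeerGroup idea in (1) above (all names here are illustrative, not the actual Lighthouse types), per-column attribution inside a batch could look roughly like this:

// Hypothetical sketch: track which peer served which column of a range batch,
// so a later KZG/custody failure can be attributed (and downscored) correctly.
// `PeerId` and `ColumnIndex` are placeholders for the real Lighthouse types.
use std::collections::HashMap;

type PeerId = String;
type ColumnIndex = u64;

#[derive(Default)]
struct PeerGroup {
    /// Which peer served the blocks of the batch.
    block_source: Option<PeerId>,
    /// Which peer served each data column of the batch.
    column_sources: HashMap<ColumnIndex, PeerId>,
}

impl PeerGroup {
    /// Record that `peer` served `column`.
    fn record_column(&mut self, column: ColumnIndex, peer: PeerId) {
        self.column_sources.insert(column, peer);
    }

    /// Look up the peer responsible for a column that failed verification.
    fn peer_for_column(&self, column: ColumnIndex) -> Option<&PeerId> {
        self.column_sources.get(&column)
    }
}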

Extras

  • Refactor verify_kzg_for_rpc_blocks out of the da_checker; it does not use the cache
  • Change RpcBlock to hold a Vec of DataColumnSidecars so we don't need a spec reference (see the sketch below)
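
A rough shape for that RpcBlock change, with unit structs standing in for the real generic types (this is only a sketch of the variant layout, not the actual Lighthouse definition):

// Placeholders for the real (EthSpec-generic) types.
struct SignedBeaconBlock;
struct BlobSidecar;
struct DataColumnSidecar;

enum RpcBlockInner {
    Block(SignedBeaconBlock),
    BlockAndBlobs(SignedBeaconBlock, Vec<BlobSidecar>),
    // Holding the column sidecars directly, rather than a custody-column list
    // derived from the chain spec, removes the need to thread a spec reference
    // through RpcBlock construction.
    BlockAndColumns(SignedBeaconBlock, Vec<DataColumnSidecar>),
}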

Why individually retryable requests are useful

Currently range sync and backfill sync fetch blocks and blobs from the network with this sequence:

  • Out of the pool of peers in the chain (peers that agree on some fork-choice state), select ONE peer
  • Immediately issue blocks_by_range and blobs_by_range to the SAME ONE peer
  • If any of those requests error, fail BOTH requests and retry with another peer

This strategy is not optimal but good enough for now. However, with PeerDAS the worst-case number of requests per batch increases from 2 (blocks + blobs) to 2 + DATA_COLUMN_SIDECAR_SUBNET_COUNT / CUSTODY_REQUIREMENT = 2 + 32 = 34 (if not connected to any larger node).

If we extend the current paradigm, a single failure on a columns_by_range request will trigger a retry of all 34 requests. Not optimal 😅
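
A back-of-the-envelope check of that worst case, assuming the PeerDAS constants as they stood at the time (DATA_COLUMN_SIDECAR_SUBNET_COUNT = 128, CUSTODY_REQUIREMENT = 4; treat these values as assumptions, they may differ in the current spec):

// Worst-case requests per batch, following the formula quoted above.
const DATA_COLUMN_SIDECAR_SUBNET_COUNT: u64 = 128; // assumed spec value
const CUSTODY_REQUIREMENT: u64 = 4;                // assumed spec value

fn worst_case_requests_per_batch() -> u64 {
    // blocks_by_range + blobs_by_range, plus one data_columns_by_range request
    // per peer when every column peer only covers CUSTODY_REQUIREMENT subnets.
    2 + DATA_COLUMN_SIDECAR_SUBNET_COUNT / CUSTODY_REQUIREMENT
}

fn main() {
    assert_eq!(worst_case_requests_per_batch(), 34);
}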

A solution is to make the "components_by_range" request able to retry each individual request. This is what block lookup requests do, where each component (block, blobs, custody) has its own state and retry count.
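
For illustration only (these are not the actual Lighthouse types), such a per-component request could be shaped roughly like this, mirroring how block lookups keep one state and retry count per component:

// Sketch of an individually retryable range request. Placeholder types stand
// in for the real block/blob/column types.
use std::collections::HashMap;

struct Block;
struct BlobSidecar;
struct DataColumnSidecar;
type ColumnIndex = u64;

struct ComponentState<T> {
    downloaded: Option<T>,
    failed_attempts: u8,
}

struct ComponentsByRangeRequest {
    // Blocks and blobs each keep their own state, so a blobs failure does not
    // force a re-download of the blocks (and vice versa).
    blocks: ComponentState<Vec<Block>>,
    blobs: ComponentState<Vec<BlobSidecar>>,
    // One entry per data column request, so a single failed
    // data_columns_by_range can be retried without repeating the other
    // requests of the batch.
    columns: HashMap<ColumnIndex, ComponentState<Vec<DataColumnSidecar>>>,
}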

dapplion avatar Aug 14 '24 16:08 dapplion

Going in that direction would add a bunch of lines to sync, so first of all I want to check if you agree with the problem, and that it's actually necessary to make each request retryable.

This does seem necessary to me, also because requesting all custody columns from the same peer means you have to find a peer that exactly matches your custody, right? So retries would happen frequently.

realbigsean avatar Aug 14 '24 20:08 realbigsean

+1 to the above.

One of the bugs we had earlier on DAS was that we were spamming block requests before we had enough custody subnet peers - we never made data column requests, but we made a bunch of block requests before getting rate limited - and we had to add a workaround to make sure we have peers across all custody subnets before we start requesting both (#6004).

The proposed change here would allow us to start requesting blocks and columns without having to wait for peers to be available across all custody subnets (for supernodes that would mean requests would be delayed until the node has peers across all 128 subnets!).

jimmygchen avatar Aug 15 '24 01:08 jimmygchen

Noting another issue with backfill sync:

Backfill sync sources blocks and blobs / data columns from peers with the _by_range RPCs. We bundle those results in an RpcBlock and send it to the processor. Since the block and column peers may now be different, we need to attribute fault to the right peer.

Currently process_backfill_blocks checks KZG validity before checking the hash chain. If the block peer sends an invalid block we may hit a KZG or availability error instead of a block hash error. We should check the integrity of the block before checking the columns' validity.

Change the function to

fn process_backfill_blocks(downloaded_blocks: Vec<RpcBlock>) -> Result<(), Error> {
    // Verify the block hash chain first, so an invalid block is attributed to
    // the block peer instead of surfacing as a KZG/availability error.
    check_hash_chain(&downloaded_blocks)?;
    // Only then verify data availability (KZG proofs / custody columns).
    check_availability(&downloaded_blocks)?;
    import_historical_block_batch(downloaded_blocks)?;
    Ok(())
}

Where import_historical_block_batch may no longer need to check the hash chain, as that is done ahead of time.

dapplion avatar Oct 10 '24 17:10 dapplion

Some extra notes for

Assume all finalized peers are in the same chain and improve SyncingChain grouping

Problem:

  • Peers are scattered between different syncing chains if not everyone is synced to the same target

Implementation:

  • Ignore finalized_root
  • Clamp finalized_epoch to our head + SOME_DISTANCE
    • If peers are advanced enough, they will group into a unique SyncingChain
    • If a peer is < SOME_DISTANCE ahead of us, it will be added to its own SyncingChain and not used

Current stable

fn get_target(remote_status: &Status) -> Target {
    Target {
        epoch: remote_status.finalized_epoch,
        root: remote_status.finalized_root,
    }
}

Proposed impl

fn get_target(remote_status: &Status) -> Target {
    // Round the target epoch up to the next 1024-epoch boundary past our own
    // finalized epoch, and ignore the remote finalized root entirely.
    let next_local = local_state.finalized_epoch - local_state.finalized_epoch % 1024 + 1024;
    Target {
        epoch: max(remote_status.finalized_epoch, next_local),
        root: Hash256::ZERO,
    }
}
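
A worked example of that clamping with the 1024-epoch granularity above (the helper name is made up for illustration):

// If our finalized epoch is 2500, then next_local = 2500 - 2500 % 1024 + 1024 = 3072,
// so every peer finalized anywhere below epoch 3072 is grouped into the same target.
fn next_local_target(local_finalized_epoch: u64) -> u64 {
    local_finalized_epoch - local_finalized_epoch % 1024 + 1024
}

fn main() {
    assert_eq!(next_local_target(2500), 3072);
    assert_eq!(next_local_target(3000), 3072); // still groups to the same target
    assert_eq!(next_local_target(3100), 4096);
}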

dapplion avatar May 10 '25 01:05 dapplion

I've added numbers to the main bullet points above so we can reference them.

I believe 1-3 and 5 have been addressed.

For (4), reconstruct during sync, I don't think we should do it, for the following reasons:

Rethinking about this again - I don't feel like there's much value in this; Pawan agrees with me.

On the surface, it sounds like a good fallback to use during sync to make sure we don't get stuck - but why would we even do this and not find good peers instead?

If the data was indeed available, good honest peers MUST be able to serve them.

If we're struggling to find peers, then it's another problem.

I don't see how reconstruction on sync helps the node or the network. It's also very expensive at high blob counts and will impact the node's ability to follow the head.

jimmygchen avatar Aug 12 '25 02:08 jimmygchen

I feel like we can close this issue now, as we have other outstanding peerdas sync issues tracked separately:

  • https://github.com/sigp/lighthouse/issues/7704
  • https://github.com/sigp/lighthouse/issues/7603

@dapplion @pawanjay176 what do you think?

jimmygchen avatar Aug 12 '25 02:08 jimmygchen

We haven't done the following which I think we should come back to when we have more time:

  • Modify syncing batches to allow them to stay in the AwaitingDownload state when they have no peers
  • Assume all finalized peers are in the same chain and improve SyncingChain grouping
  • Change the by_range sync download algorithm to fetch blocks first, then columns. Use the blocks as the source of truth to match against columns and penalize custody failures. V2: Send blocks to the processor to verify the proposer index and proposer signature (significantly increases the cost of an attack). Note: this will require much more complex logic, similar to lookup sync. Note^2: batch syncing will no longer download and process in parallel, because processing is sequential and we need to process blocks before completing a batch download. I'm not sure if we should spend time on this vs. tree sync cc @dapplion

pawanjay176 avatar Aug 12 '25 02:08 pawanjay176