lighthouse icon indicating copy to clipboard operation
lighthouse copied to clipboard

Reconstruct data columns without blocking processing and import

Open jimmygchen opened this issue 1 year ago • 7 comments

Issue Addressed

Part of #4983. This is a work in progress.

The current sequence of reconstruction:

  1. Receive blocks & data columns
  2. As soon as the supernode receives 50% of columns, start reconstruction (while holding the availability cache write lock)
  3. After reconstruction completes, the node imports block and publishes the remaining 50% of columns
  4. Now it may also receives the remaining 50%, which gets verified and then ignored

Proposed Changes

  • Changes to data column reconstruction:
    • Attempt reconstruction without holding availability cache lock, so we can process other gossip / rpc data columns simultaneously
    • Check availability cache again before publishing reconstructed columns to avoid publishing excess duplicates
  • Remove some unnecessary RuntimeVariableList conversion for data columns

Additional Info

Below are logs from local testing, notice that the reconstruction blocks processing of incoming data columns (due to holding the availability cache write lock), and they gets ignored as duplicates:

Jun 24 08:01:15.781 DEBG Successfully verified gossip data column sidecar, index: 63, block_root: 0x690d…0e3e, slot: 40556
Jun 24 08:01:16.280 DEBG Reconstructed columns                   count: 64, service: availability_cache, service: beacon
Jun 24 08:01:16.281 DEBG Writing data_columns to store           count: 128, block_root: 0x690d…0e3e, service: beacon
Jun 24 08:01:16.284 INFO Gossipsub data column processed, imported fully available block, block_root: 0x690d…0e3e
Jun 24 08:01:16.284 DEBG Sending pubsub messages                 topics: [DataColumnSidecar(DataColumnSubnetId(0)), DataColumnSidecar(DataColumnSubnetId(1)), DataColumnSidecar(DataColumnSubnetId(2)) .. DataColumnSidecar(DataColumnSubnetId(31))], count: 64, service: network
Jun 24 08:01:16.314 DEBG Successfully verified gossip data column sidecar, index: 64, block_root: 0x690d…0e3e, slot: 40556
Jun 24 08:01:16.314 DEBG Ignoring gossip column already imported, data_column_index: 64, block_root: 0x690d4885fe843d085464b4f36b774597525d2dbddcd26ceca95fd4fdf4930e3e
Jun 24 08:01:16.323 DEBG Successfully verified gossip data column sidecar, index: 65, block_root: 0x690d…0e3e, slot: 40556
Jun 24 08:01:16.323 DEBG Ignoring gossip column already imported, data_column_index: 65, block_root: 0x690d4885fe843d085464b4f36b774597525d2dbddcd26ceca95fd4fdf4930e3e
Jun 24 08:01:16.332 DEBG Successfully verified gossip data column sidecar, index: 66, block_root: 0x690d…0e3e, slot: 40556
Jun 24 08:01:16.332 DEBG Ignoring gossip column already imported, data_column_index: 66, block_root: 0x690d4885fe843d085464b4f36b774597525d2dbddcd26ceca95fd4fdf4930e3e
Jun 24 08:01:16.341 DEBG Successfully verified gossip data column sidecar, index: 67, block_root: 0x690d…0e3e, slot: 40556
Jun 24 08:01:16.341 DEBG Ignoring gossip column already imported, data_column_index: 67, block_root: 0x690d4885fe843d085464b4f36b774597525d2dbddcd26ceca95fd4fdf4930e3e
Jun 24 08:01:16.349 DEBG Successfully verified gossip data column sidecar, index: 68, block_root: 0x690d…0e3e, slot: 40556
Jun 24 08:01:16.349 DEBG Ignoring gossip column already imported, data_column_index: 68, block_root: 0x690d4885fe843d085464b4f36b774597525d2dbddcd26ceca95fd4fdf4930e3e
Jun 24 08:01:16.358 DEBG Successfully verified gossip data column sidecar, index: 69, block_root: 0x690d…0e3e, slot: 40556
Jun 24 08:01:16.358 DEBG Ignoring gossip column already imported, data_column_index: 69, block_root: 0x690d4885fe843d085464b4f36b774597525d2dbddcd26ceca95fd4fdf4930e3e
Jun 24 08:01:16.367 DEBG Successfully verified gossip data column sidecar, index: 70, block_root: 0x690d…0e3e, slot: 40556

jimmygchen avatar Jun 25 '24 04:06 jimmygchen

⚠️ The sha of the head commit of this PR conflicts with #5986. Mergify cannot evaluate rules on this PR. ⚠️

mergify[bot] avatar Jun 25 '24 04:06 mergify[bot]

Thanks for the review! I've addressed the comments and will start local testing.

jimmygchen avatar Jun 28 '24 04:06 jimmygchen

Thanks for the review! Addressed your comments in https://github.com/sigp/lighthouse/pull/5990/commits/7d2d826a892f46dd965484a1ecc49401db57ba54.

jimmygchen avatar Jul 01 '24 03:07 jimmygchen

Generally looks good to me! Lots of conflicts, tho

Do you want to port this to unstable?

dapplion avatar Aug 16 '24 08:08 dapplion

Thanks, I've just tried to resolve conflicts, but unfortunately I see this is going to conflict a lot with #6268, and it might be easier to implement it once that one is merged, as it has quite a bit of common logic and changes. Since this one is very outdated, I think it make sense to just create a new PR, using the branch from #6268 as base branch.

jimmygchen avatar Aug 19 '24 12:08 jimmygchen

Leaving this PR open so we don't forget this.

jimmygchen avatar Aug 22 '24 12:08 jimmygchen

Please update to point at unstable by either:

  1. Rebasing on unstable (if your branch has a small number of commits that are easy to tease out), or
  2. Merging origin/das into this PR: git fetch origin; git merge origin/das. This will result in the smallest number of conflict resolutions and is better for branches that already contain merge commits or have extensive history.

michaelsproul avatar Aug 27 '24 05:08 michaelsproul

Closing in favour of #6403

jimmygchen avatar Sep 17 '24 04:09 jimmygchen