polkadot-sdk icon indicating copy to clipboard operation
polkadot-sdk copied to clipboard

Rebroadcast blocks before importing them to reduce the hop latency.

Open eskimor opened this issue 4 years ago • 5 comments

@rphmeier suggested this as an optimization. Apparently we have this on Ethereum already, thanks to @tomusdrw.

eskimor avatar Jan 18 '21 16:01 eskimor

Note that the Ethereum implementation is verifying PoW before re-broadcasting, in substrate similarly we should perform external consensus checks (like BABE signature verification) before sending further.

tomusdrw avatar Jan 18 '21 17:01 tomusdrw

I talked to @gnunicorn, @tomusdrw and @tomaka about this issue and here's what i gather to start working on this issue:

Currently on Substrate:

  • A block announce is received
  • Preliminary checks / validations are done
  • If valid, the block is downloaded
  • Another step if validation on the downloaded block is done
  • Transactions are executed in the runtime
  • Block is propagated.

The work to be done involves propagating the block after the preliminary checks and block download.

rakanalh avatar Jan 28 '21 12:01 rakanalh

@rakanalh Yes, that sounds accurate. We want all GRANDPA & BABE checks to be done, including those that are done in their respective ImportBlock implementations.

rphmeier avatar Jan 28 '21 19:01 rphmeier

@rphmeier just to confirm, you mean you want all consensus checks to be done before propagating? My initial thought was that we want to only pass the DefaultBlockAnnounceValidator validation step, download the block and then propagate the block and only then validate through consensus and execute transaction.

So my plan is more like: BlockAnnounce -> DefaultBlockAnnounceValidator -> Propagate block -> Consensus checks -> Block runtime / transaction execution.

rakanalh avatar Jan 28 '21 19:01 rakanalh

just to confirm, you mean you want all consensus checks to be done before propagating?

Yes.

Quoting @tomusdrw from above:

in substrate similarly we should perform external consensus checks (like BABE signature verification) before sending further.

The main expensive thing we do with the block is actually executing the block and writing the state to DB. That's what we need to avoid.

rphmeier avatar Jan 28 '21 19:01 rphmeier

We are currently reviving this issue to get it done. The comments here are quite old.

In general, the header verification should be based only on the header and avoid costlier operations. Discussed with @bkchr in the past the following issues:

  1. We should reorganize the block import pipelines a bit. Verification should work mostly on the header and avoid state access. IMO this step can be done iteratively for the different importers we have in the repository. One offender is the inherent check which can be moved from verification to the import stage, work started in #8446. While catching up on the discussion here and looking at the code, I realized that we have, for example, the equivocation checks also in the verification phase: https://github.com/paritytech/polkadot-sdk/blob/3100de0ecdc7e1c09ca7f962a05d993475a7506f/substrate/client/consensus/babe/src/lib.rs#L990-L990. To me it would make sense to keep them there, as it does not make sense to distribute equivocating blocks.
  2. Then move the announcement to the post-verification, pre-import phase.

What we were not discussing was the ordering problem of announcing blocks that we have not yet imported, as the storing of block bodies only happens at the end of import: https://github.com/paritytech/polkadot-sdk/blob/c4ba6e0288a127f7f452e09e872960837aa313c0/substrate/client/db/src/lib.rs#L1515

So we cannot easily deliver these blocks directly after announcement.

This problem is discussed in a previous attempt to resolve this issue. This requires some investigation on what the current status is. AFAIK, not much changed with respect to this, and we need to come up with a solution. @sistemd please take a look at the previous PR and assess what was attempted there.

@bkchr if you have further opinions and insights, now is a good time to drop them here :D.

skunert avatar May 14 '25 08:05 skunert

  1. While catching up on the discussion here and looking at the code, I realized that we have, for example, the equivocation checks also in the verification phase:

We want to forward blocks asap. I would do these steps as part of the import.

Generally how I would like to see this implemented:

  • Split Verification and Import (which is already being worked on)
  • Verification is directly called by the sync code, no forward to the import queue, maybe discussable to put it into some background task
  • If the verification is successful, we cache the header in the sync code, request the block bodies and re-announce the header to the connected nodes
  • Incoming block bodies requests for blocks that we are still downloading, will be cached as well inside the sync code
  • When the block bodies is downloaded, we cache the block bodies in the sync code, we start forwarding it to the cached requests and import the block in the background
  • When the block is imported, we remove the data & header from the cache.

There should be some reasonable upper limit for block bodies and headers we keep cached, I would say 10 (10 headers and 10 block bodies).

bkchr avatar May 14 '25 20:05 bkchr

As discussed today, to implement this in AURA, we first need to implement node side authority list tracking. After this is done the shared authority list on the node side would be passed to the Verifier and this way it can validate the seal without calling into the runtime to fetch the authority list.

bkchr avatar Jul 09 '25 12:07 bkchr