subnet-evm

Add option for backfilling blocks (for nodes that have performed state sync)

Open darioush opened this issue 2 years ago • 6 comments

Context and scope

Networks may want to use state sync (for faster joining and a smaller database) yet still maintain full block history (e.g., for data availability and to preserve the ability to recreate state from genesis). This ticket is to add an option that allows a node without full block history to request blocks from its peers without executing them. This process should run in the background with minimal overhead, both for nodes downloading blocks and for nodes serving blocks to their peers.

Subnet-evm already supports serving blocks to peers. This is used in the last step of state sync to fetch the most recent 256 blocks from peers, since their hashes are accessible to application code via the BLOCKHASH opcode.

Relevant code:

  • Request handling code
  • Example request making code

This feature should also be added to coreth.

Discussion and alternatives

[TODO: add any relevant alternatives, and document limitations of the approach. Notably, using the proposed alternative above does not improve data availability for bootstrapping, since the proposerVM header / block is not synced.]

Open questions

[TODO: add any open questions]

darioush, Oct 03 '23 16:10

Should this be supported from the ProposerVM?

aaronbuchwald, Oct 03 '23 16:10

@aaronbuchwald, @darioush a clean way to introduce this feature is the following:

  1. Extend the StateSyncableVM interface with methods to:
    • Check whether the VM wants to backfill blocks once state sync is done (so this will become a VM config) and, if so, which block to start backfilling from (normally the last accepted block, but possibly a lower block after a VM restart).
    • Provide the VM with the downloaded blocks to backfill. This lets us avoid block.Accept calls and use an ad hoc method to parse and store the blocks in whatever state and indexes the VM has, without accepting them.
  2. Only when state sync is done, and the engine has been notified of it, can we start downloading blocks to backfill with GetAncestors calls. GetAncestors is currently used during bootstrapping, so we may need to defer block backfilling until bootstrapping is done to avoid mixing requests. I think this meets the expectation that the VM should state sync as fast as possible and only then work on backfilling history while accepting new blocks.

I would avoid the solution where the VM issues block requests itself, short-circuiting the ProposerVM, as we may end up breaking some of the VM's invariants. Not sure if @StephenButtolph has visibility/opinions on this.

abi87, Oct 05 '23 13:10

I think extending StateSyncable (or a new interface) is a good idea, since we have used this method for the VM to signal the engine it has some optional capabilities. Perhaps the VM can return the range of blocks it wishes to backfill.

I agree that using the engine is preferable to the VM requesting, since as you mention the VM requesting approach has the following cons:

  • Each VM will need to implement this
  • ProposerVM style wrapping will not work and as a result this will not help data availability with bootstrapping.

We should allow the backfill to occur whether the VM has just finished state syncing or state syncing is disabled. For example, a VM may turn on state sync to join the network and then turn it off later, and I think we should still be able to backfill in that case.

darioush, Oct 05 '23 15:10

@darioush I am thinking of adding these methods to the StateSyncableVM interface:

  1. BackfillBlocksEnabled(ctx context.Context) (ids.ID, error) returns whether the VM wants to backfill and from which blockID it needs to start.
  2. BackfillBlocks(ctx context.Context, blocks [][]byte) error passes the downloaded blocks to the VM. The error can also be used to signal that the VM wants to stop (with a special error flag), not just that there were errors in parsing the downloaded blocks.

Isn't it a bit weird to backfill when state sync is not enabled? I view backfilling as a completion of the state sync process rather than as a standalone feature: non-state-syncable VMs should not be provided with backfilled blocks, and I guess neither should a VM whose state sync is disabled. We can discuss this.

abi87, Oct 05 '23 16:10

Could we switch from using a special case error to returning a boolean to indicate backfill should stop?

Other than that, I think this is the right approach.

We should keep the interface simple and support a single starting point, maintained by the VM, to backfill from, rather than asking for multiple block ranges. If the VM ends up with part of that range already filled in, it can skip those blocks within BackfillBlocks, or we can revise BackfillBlocks to support signaling that a certain number of blocks should be skipped.

aaronbuchwald, Oct 06 '23 20:10

Related issue: https://github.com/ava-labs/avalanchego/issues/2345

ceyonur, Jan 08 '24 10:01