py-evm icon indicating copy to clipboard operation
py-evm copied to clipboard

`VM.validate_header()` is called twice during `Chain.import_block()`?

Open hwwhww opened this issue 7 years ago • 5 comments

What is wrong?

The current calling route:

class Chain(BaseChain):
    def import_block(self,
                     block: BaseBlock,
                     perform_validation: bool=True
                     ) -> Tuple[BaseBlock, Tuple[BaseBlock, ...], Tuple[BaseBlock, ...]]:
        ....
        base_header_for_import = self.create_header_from_parent(parent_header)
        imported_block = self.get_vm(base_header_for_import).import_block(block)
        # ^^^^^^ dive into VM.import_block()

        # Validate the imported block.
        if perform_validation:
        # ^^^^^^ perform_validation is True by default.
            validate_imported_block_unchanged(imported_block, block)
            self.validate_block(imported_block)
        ....

    def validate_block(self, block: BaseBlock) -> None:
        if block.is_genesis:
            raise ValidationError("Cannot validate genesis block this way")
        VM = self.get_vm_class_for_block_number(BlockNumber(block.number))
        parent_block = self.get_block_by_hash(block.header.parent_hash)
        VM.validate_header(block.header, parent_block.header, check_seal=True)
        # ^^^^^^ second time call VM.validate_header!
        ....
class VM(BaseVM):
    def import_block(self, block):
        ....
        return self.mine_block()

    def mine_block(self, *args, **kwargs):
        ....
        # Perform validation
        self.validate_block(final_block)

        return final_block

    def validate_block(self, block):
        ....
        if block.is_genesis:
            validate_length_lte(block.header.extra_data, 32, title="BlockHeader.extra_data")
        else:
            parent_header = get_parent_header(block.header, self.chaindb)
            self.validate_header(block.header, parent_header)
            # ^^^^^^ first time calling VM.validate_header! check_seal is True by default.
        ....

How can it be fixed

Is Chain.import_block the main API for syncing? I guess perform_validation is for testing. Can we remove the second call and set the first call check_seal=True?

hwwhww avatar Oct 19 '18 09:10 hwwhww

  1. I recall perform_validation being for testing. At the time we couldn't figure out a better solution.
  2. I suspect one of these can be removed but it isn't clear to me which one and how because I think what we're seeing is multiple different use cases (probably 3), where two of them only hit that validation once, but import_block hitting both.

It'd be ideal to fix this, but I think it'll require some experimentation to figure out exactly how.

pipermerriam avatar Oct 19 '18 17:10 pipermerriam

It looks like so far they are both used only privately and a couple places.

AFAICT, the tension comes from these two goals:

  1. We don't want VM.import_block to just work if the block is invalid, and have to check it later.
  2. A full validate_block has to be done at the chain level, because you need Chain to validate the uncles.

So VM ends up doing a partial check, and relies on chain to do the rest. Which leads to messy buggy code that caused this issue.

Maybe importing a block should become only a chain level thing and that's what calls to the VM's API to accomplish the import. Which I guess is potentially loosening goal 1. I think that's okay if we remove the name import_block and replace it with something that doesn't sound like it does validation, like rebuilt_block = VM.rebuild_block(original_block). It wouldn't have side-effects or do any validations, just regenerate the final state according to the VM's rules. Then the caller would do the necessary validation, and persisting (which it already does...).

carver avatar Oct 19 '18 19:10 carver

A full validate_block has to be done at the chain level, because you need Chain to validate the uncles.

@pipermerriam off-topic question: for beacon chain, do we have to implement validate_block in BeaconChain to be aligned with Chain, although beacon chain doesn't have uncles? Currently the process_block function will validate_attestations.

hwwhww avatar Oct 22 '18 10:10 hwwhww

@hwwhww 1) I'm not sure I understand your question and 2) BeaconChain should not try to be shaped like Chain in any case where it doesn't fit well. It's better for BeaconChain to be properly architected than for it to have API parity with Chain.

pipermerriam avatar Oct 22 '18 15:10 pipermerriam

@pipermerriam whoops sorry, my question was do I have to extract the block validation from BeaconStateMachine.import_block and BeaconStateMachine.process_block, and make BeaconChain trigger the block validation. Your second response answered my question. 👍

hwwhww avatar Oct 22 '18 15:10 hwwhww