zkevm-specs icon indicating copy to clipboard operation
zkevm-specs copied to clipboard

Executed bytecode source for opcode lookup

Open han0110 opened this issue 3 years ago • 4 comments

In EVM there are 3 different sources for executed bytecode:

  1. When contract interaction, it's from contract's bytecode
  2. When contract creation in root call, it's from transaction's calldata
  3. When contract creation in internal call, it's from caller's memory

To avoid too much complexity on EVM circuit, we try to remove 3. by explicitly copying the caller's memory to bytecode_table, since for CREATE2 we already need to hash the initcode, so we abuse this for also CREATE.

Then the sources now are either 1. or 2., but there is another thing EVM circuit desires for:

For executed bytecode, EVM circuit expect executed bytecode to be analyzed and annotated if it's is_code on every byte, for verifying JUMP* in a single-step way, otherwise every time when it comes to JUMP*, it needs to look back to check if it's is_code or not, and this could be multi-steps.

So there are at least 2 options here to support creation transaction:

  1. EVM circuit lookup sources 1. and 2. by condition, and implement is_code annotation on both bytecode circuit and tx circuit.

    This seems reasonable but tx circuit has not been explored too much yet. Although it brings some implementation redundancy, the is_code annotation should not be too costly to do.

  2. EVM circuit lookup sources 1. always, and explicitly copying transaction's calldata to bytecode_table when it's a creation one.

    This seems to bring more loading to bytecode_table at first glance, but if a DOS attacker wants to blow up the bytecode_table, it would rather use EXTCODECOPY to read different huge contracts, which only costs a constant 2600 and generates 0x6000/2600 ≈ 9.45 bytes per gas in worst case. So it seems also a reasonable approach to avoid implementation redundancy and make the executed bytecode single source for EVM circuit.

Any feedbacks are appreciated.

han0110 avatar Dec 04 '21 07:12 han0110

I am currently addressing the review comments on https://github.com/appliedzkp/zkevm-specs/pull/191, and as @ed255 pointed out, it is important to finalise the definition of StepState.code_source in the specs.

@han0110 My opinion on the above issue is to use the bytecode table as the single source even for contract creation, specifically, this:

EVM circuit lookup sources 1. always, and explicitly copying transaction's calldata to bytecode_table when it's a creation one.

This will also reduce implementation complexity and any chances of possibly missing the is_create cases for any *CODE* related opcodes.

From #191 's perspective, I believe I need to populate the bytecode table for the above mentioned cases and document those changes appropriately. Would be nice to know what others think about these changes and whether or not they should be made as a separate PR. Thanks!

roynalnaruto avatar May 09 '22 05:05 roynalnaruto

Replying as an assigned reviewer for #191. I am not very familiar with the details, but 2. option seems cleaner to me too.

miha-stopar avatar May 09 '22 09:05 miha-stopar

I also prefer option 2 (single source of bytecode: the bytecode table) because it unifies the way bytecode is accessed (and I think this simplifies the design). Specially considering that we will have a copy circuit where we can offload copying memory to bytecode table and tx.call_data to bytecode table.

ed255 avatar May 09 '22 10:05 ed255

As discussed in this week's call, let's implement option 2 for simplicity of EVM circuit. But for the Copy circuit (#194), it then needs to handle the copy from tx calldata in tx_table to bytecode_table in the future (or we can also implement this in Tx circuit directly), and such copy will be triggered by EVM circuit in BeginTx if it has is_create == True.

The rest todos to resolve this issue:

  • EVM circuit
    • [x] Rename code_source to code_hash
    • [ ] (maybe) Remove is_create and is_root from tracking state and lookup them when necessary
    • [ ] In creation tx, lookup Copy circuit to make sure the init code is copied into bytecode_table and get the code_hash
  • Copy circuit (or Tx circuit)
    • [x] Implement copy from tx calldata in tx_table to bytecode_table

han0110 avatar May 13 '22 07:05 han0110

I believe we can consider this issue resolved as we already have CREATE in the specs https://github.com/privacy-scaling-explorations/zkevm-specs/pull/355 @han0110 since you reviewed the CREATE spec PR, can you confirm if it aligns with this discussion https://github.com/privacy-scaling-explorations/zkevm-specs/issues/73#issuecomment-1125754364 ? If so, we can then close this issue :)

ed255 avatar May 12 '23 11:05 ed255

For creation transaction it's not yet implemented, but the concept is similar, so yeah we can close this issue.

han0110 avatar May 15 '23 02:05 han0110