Implement execution witness without unwinding (i.e. resurrect `eth_getWitness`)
Rationale
Today's zkVMs rely on execution witnesses as inputs to their provers. This is now mostly happening using Reth's debug_executionWitness RPC endpoint. With Reth however, this only works well for the latest block and for up to several thousand blocks behind the tip (beyond a certain point there is too much of a performance overhead). An Erigon 3 archive node with commitment history enabled, could provide execution witnesses for any block at any point of time.
Implementation
A good starting basis for the implementation is the experimental eth_getWitness endpoint, which is now dormant and hasn't been maintained for some time, but could be resurrected.
The witness format used by Erigon uses CBOR serialization, instead of Reth's RLP, but changing the serialization should be straightforward.
The format used by Geth and Reth for the response of debug_executionWitness is :
{
"state": [
"0xf90211a0e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855a0...",
"0xf851a0ab83..."
],
"codes": [
"0x6080604052348015600f57600080fd5b506004361060285760003560e01c8063..."
],
"keys": [
"0xd8da6bf26964af9d7eed9e03e53415d37aa96045",
"0x0000000000000000000000000000000000000000000000000000000000000000"
],
"headers": [
"0xf90218a0.."
]
}
where
- state contains the list of RLP encoded trie nodes in the witness trie. These come in an arbitrary order, however the order of traversal can be inferred during runtime by building a map
keccak(nodeRlp) => decode(nodeRlp)and starting the traversal using the root hash of the MPT as the key in the lookup. - codes is the list of acessed/created bytecodes during block execution.
- keys is the list of account and storage keys accessed/created during execution.
- headers is a list of RLP-encoded block headers needed for
BLOCKHASHopcode support. (I don't know yet if Erigon supports tracking block hashes accessed. Reth keeps track of the lowest block referenced byBLOCKHASHduring execution and providing headers from this block number until the block number in the request)
The eth_getWitness endpoint works pretty much the same way as Reth's debug_executionWitness , by unwinding to the parent of the desired block, executing the block ephemerally while recording any state accesses (accounts, storage slots, code reads/updates/deletes), and finally generating merkle proofs for each of these accesses.
To enable proofs for any block these changes are needed to the code:
-
Blocks need to be executed by using a historical state reader, instead of unwinding. This should not be to difficult as there are already other RPC calls that do this. E.g.
debug_traceBlockByNumber. For the witness we just need custom state writers to record state accesses into our desired internal structure. -
We need to keep track of accessed block hashes. This can be done by implementing a custom
evmtypes.BlockContext.GetHashfunction that will record any accessed block hash. -
Handle edge cases particular to
eth_getWitnessin the trie traversal. While most edge cases (e.g. non-existent keys) are handled by the traversal algorithm used ineth_getProofwhich is currently fully working, there is a certain edge case which needs special handling for the execution witness.
Edge cases
While most tricky examples and edge cases can be handled by the code of eth_getProof in the same way, there are certain cases where the eth_getWitness witness trie needs to reveal more information about the nodes.
1. Branch node with 2 children collapsing to extension node after 1 child gets deleted
This is a case where we have a branch node with exactly 2 children and 1 of them gets deleted during execution of the block. In eth_getProof the witness trie would provide the last node as the branch node with the accessed child revealed and its sibling not revealed (provided as hash only). For the execution witness both children should be revealed, since the information in the sibling may be needed in case the collapsed branch node (which would be converted to an extension node) needs to be merged with its child and/or parent.
2. Branch node with m children where m-1 children get deleted
This is a more general case of edge case 1, where all but 1 of the children of a branch node get deleted.
Tackling the edge cases
The challenge with the above edge cases is that the single surviving child is not part of the access list from execution, so it doesn't get recorded there. It only gets accessed during merkle root calculation when the state updates are supposed to be applied to the trie to calculate the new root.
Approach : Simulate the trie updates and trace resolved nodes
The approach Geth follows is to use a tracer (prevalueTracer) that records every trie node touched during a simulation of the new root calculation. This would require substantial work, all the node resolution points need to be clearly distinguished and the tracer hook invoked in each such point, but ultimately, this is the only approach that would work, since the collapses could cascade to higher levels in the trie, and clever workarounds around the reading the pre-execution trie without simulating MPT updates would fail (either due to giving an incomplete witness, or a witness redundant nodes) because of cascading deletes.