ethereumjs-monorepo
ethereumjs-monorepo copied to clipboard
Trie: implement createRangeProof v4
See: #2672 (this PR is messed up), and commit refs (this PR has them all squashed) at #2820
This PR implements Trie class method: createRangeProof.
createRangeProof
-
params:
- startingKey: Uint8Array
- limitKey: Uint8Array
-
returns
- keys: Uint8Array[ ]
- values: Uint8Array[ ]
- proof: Uint8Array[ ] | null
details
- Lookup path to
startingKey
- If no node exists for
startingKey,- If starting key === '0x000...'
- return
AllElementsRangeProof
- return
- else
- Walk back up through path to find the first value node with a key higher than the
startingKey - Treat this node as the new starting point
- Walk back up through path to find the first value node with a key higher than the
- If starting key === '0x000...'
- Lookup path to
limitKey
- If no node exists for
limitKey- Walk back up through path to find the first value node with a key higher than the
limitKey - Treat this node as the new range limit
- If no node is found, the range effectively becomes all nodes to the right of
startingKey
- Walk back up through path to find the first value node with a key higher than the
-
Walk the section of the Trie between
startingNodeandlimitNode, collecting thekeysandvalues, and the total set of nodes included in the proofs for each of them. -
Return
RangeProofobject:
keys: Uint8Array[]
values: Uint8Array[]
proof: Uint8Array[]
If 0 value nodes exist between startingKey and limitKey, return zeroElementsProof
{keys: null, values: null, proof: null}
If 1 value node exists between startingKey and limitKey, return oneElementProof
{keys: [ node.key ], values: [ node.value ] , proof: trie.createProof(node) }
If startingKey = 0x000...000 and limitKey = 0xFFF...FFF, return allElementsProof
{keys: [ *all keys* ], values: [ *all values* ], proof: null}
Codecov Report
Merging #2821 (0f75392) into master (466b6f2) will decrease coverage by
0.01%. The diff coverage is84.59%.
Additional details and impacted files
| Flag | Coverage Δ | |
|---|---|---|
| block | 88.66% <ø> (ø) |
|
| blockchain | 92.58% <ø> (ø) |
|
| client | 87.55% <ø> (-0.11%) |
:arrow_down: |
| common | 98.65% <ø> (ø) |
|
| ethash | ∅ <ø> (∅) |
|
| evm | 69.93% <ø> (ø) |
|
| rlp | ? |
|
| statemanager | 82.98% <ø> (ø) |
|
| trie | 89.74% <84.59%> (+0.10%) |
:arrow_up: |
| tx | 95.96% <ø> (ø) |
|
| util | 86.77% <ø> (ø) |
|
| vm | 79.20% <ø> (ø) |
Flags with carried forward coverage won't be shown. Click here to find out more.
Before I continue locally, 2ce929be24b56e64422d6f43962b679005e61270 is a commit which passes the (old) tests.
Not included: empty proofs where trie still has a key with value right of the limitHash (this is a prerequiste for snap sync: always report at least one key-value, or none for a non-existence proof of items right of limitHash)
Code is super messy, and I have to look into if I have to use nibblesCompare instead of compareBytes (likely: yes).
Also not convinced that my code for "find value right of this key" is always correct, I think there are trees with certain inputs that it reports no value while there are values left :thinking:
@jochem-brouwer
i'm a little confused by the comments, and labeling here (v4...?) happy to continue work on it, i just want to understand the status so i work on the right thing :smirk:
This v4 is the fourth time this PR has been cherry-picked over others.
The comment about "logic of finding right key" is such that, if one requests a range of nodes, and no nodes are found, then in the snap protocol the key right of the "limitHash" key should be returned. Thus we need logic such that given key X, one returns the key (plus value) of the key right of X (so with key higher than X).
The other logic is about using nibblesCompare or compareBytes (these methods are slightly different) to make the logic correct.
Thus we need logic such that given key X, one returns the key (plus value) of the key right of X (so with key higher than X).
:+1: i can tinker around with a couple approaches to that, and see what works well.
a couple questions
- are we trying to implement this generally for MPT with keys of any size? or can we assume these will be hashed 32 bytes keys?
- do we just need to return the key/value? or will we ultimately want the path/proof as well?
Thus we need logic such that given key X, one returns the key (plus value) of the key right of X (so with key higher than X).
👍 i can tinker around with a couple approaches to that, and see what works well.
a couple questions
- are we trying to implement this generally for MPT with keys of any size? or can we assume these will be hashed 32 bytes keys?
- do we just need to return the key/value? or will we ultimately want the path/proof as well?
-
Regarding this: this is non-trivial. If we have keys of equal size (e.g. 32 bytes), then it is well defined if keys are higher/lower than other keys. For tries with dynamically sized keys: is
0x01higher than0x0100? Or Is it higher than0x0001? This is not trivial and is currently not implemented (note: Geth also does not implement it) -
This is part of the get range proof logic: if there are no key/values found in the range, then return the rightmost key/value. So this is part of the protocol and not necessarily part of the method.
@jochem-brouwer
Take a look at this when you get the chance.
I'm doing a bit more cleanup on the PR, but createRangeProof is passing the tests that were already written when I picked this up.
There are some obvious in-efficiencies to the current solution that I could continue to work on, but if you agree that the method works as is, we could address that stuff in a future PR
@ScottyPoi I will review today. I just started on it, but I noticed there are not many comments around, which makes it rather hard to review, since some lines are rather complex and I have a hard time figuring out what they do :thinking: Could you add some comments + documentation of the added methods?
What is the state of this PR??
I think we should drop this PR. The idea of all this createRangeProof was to create an intermediate createRangeProof before we had a flat database. Since this is getting there soon (TM), the flat DB implementation is more trivial than using the current tree-based library. (I initially thought with a tree database, it would not be trivial, but also not super complex, but this turned out to be rather complex especially with these edge cases)
We can address this with snap DB.