o1js icon indicating copy to clipboard operation
o1js copied to clipboard

Promote indexed merkle map to standard API

Open 45930 opened this issue 7 months ago • 4 comments

This API is tested and documented, (more).

It seems ready to leave experimental, to me. If not, what do we need to do?

45930 avatar Apr 01 '25 17:04 45930

The current API has a subtle gotcha that I think should be mitigated.

The map 'commitment' consists of two field elements: root and length. It's easy to miss the length and only check for the root.

They should be combined in some way so that you include/check against both fields by default.

mitschabaude avatar Apr 01 '25 18:04 mitschabaude

@mitschabaude I'm curious - is there actually a collision risk with trees of different lengths? Or else why would we need to include the tree length?

45930 avatar Apr 02 '25 18:04 45930

I recently also implemented a very similar tree like this in protokit, and i ended up not including the maximum occupied index (being length in this case i assume). Two reasons for this:

  1. I ended up realizing that you don't actually need to strictly increment the placement of new leafs in the tree. Its optimal for efficiency reasons, but the data structure stays sound even if you arbitrarily place indexed leafs in the underlying sparse tree. The only thing you need to verify is that you place it in some empty sparse tree leaf, which one does anyways by checking the merkle witness with 0.
  2. If you do stuff to the tree, you are expected to have all relevant data to the tree reconstructed somewhere anyways, therefore you can always compute that value locally.

rpanic avatar Apr 15 '25 20:04 rpanic

  1. Is exactly the reason length is needed for IndexedMerkleMap in its current form. The index is not published as part of the action, so inserting your data somewhere else is an attack vector

mitschabaude avatar Apr 15 '25 20:04 mitschabaude

@mitschabaude I am trying to understand the exact attack vector. How does an attack vector possible when the commitment (root) is unique to the tree (except if hash collisions). And if there's an attack vector, how should we fix it? Should we add indices to the commitment?

boray avatar Jun 05 '25 17:06 boray

@mitschabaude I am trying to understand the exact attack vector.

Here's the attack vector:

  1. Assume your zkApp stores an IndexedMerkleTree that is updated in a decentralized manner.
  2. Assume that the IndexedMerkleTree API is used incorrectly and length is not stored on-chain.
  3. Then a user who performs an update can witness any value as length, larger than the actual length
  4. Which means they can do an insertion anywhere in the tree
    (note: the circuit enforces that insertions are always done at the length, and then increment the length by 1)
  5. Nobody knows where this insertion was done! So, nobody can reconstruct the correct tree now.
  6. Nobody except the malicious user can produce Merkle paths anymore, and therefore nobody can do anything with the tree. The contract is broken.

Note that this attack vector only exists if the length is not committed onchain.

The existing examples do correctly store the length, e.g. here: https://github.com/o1-labs/o1js/blob/a6b65037a7ac1f1dbe647bb3767d00882cc3b02f/src/lib/mina/v1/actions/batch-reducer.unit-test.ts#L59-L63 checked in the contract here: https://github.com/o1-labs/o1js/blob/a6b65037a7ac1f1dbe647bb3767d00882cc3b02f/src/lib/mina/v1/actions/batch-reducer.unit-test.ts#L87-L89

The check this.eligibleLength.requireEquals(eligibleMap.length) prevents against step (3) of the attack above: It forces the length to be the actual length of the tree, and therefore forces new insertions to be right after the existing values. That matches the assumption of every other user who keeps track of the tree, and everyone is able to reconstruct the tree after your update.

My point was that this check is easy to miss, because it's different from the old MerkleTree API, where you only store the root onchain and not the length.

how should we fix it?

My proposal above was to automatically hash both the root and length into a single commitment, and store that onchain. This prevents step (3) of the attack above.

The new commitment could simply be called root for conformance with the usual API. So, what I would do is

  • change the name of the current root field to something like _internalRoot.
  • define a public getter method
get root() {
  return Poseidon.hash([this._internalRoot, this.length])
 }

Now since both length and root are included, it's safe and correct to only store tree.root onchain, and only do a single check in your contract:

// in zkapp method, when witnessing the map
let map = Provable.witness(IndexedMerkleMap, () => ...);
this.onchainRoot.requireEquals(map.root);

// ... do Merkle updates ...

// when updating the onchain state
this.onchainRoot.set(map.root);

mitschabaude avatar Jun 06 '25 07:06 mitschabaude

The way we solve this issue in protokit is that instead of making the length part of the commitment, we include the index where the new leaf was inserted in the emitted event (or DA in our case). This unlocks the potential to use another allocation algorithm instead of strictly incrementing. Also it keeps the commitment a "pure tree root", making witnesses not need the length at all.

rpanic avatar Jun 09 '25 14:06 rpanic