mithril
mithril copied to clipboard
Truncated Tree Traversal
Currently, mithril certificates involve verifying a few hundred paths over a Merkle tree. For the parts of the paths that are close to the root, there will be large amounts of overlap. For example, there are exactly two children under the root. We thus expect ~half of the paths to use one child and the rest to use the other. This implies that both children appear multiple times in the certificate.
A simple approach to mitigate this is to select the X-th layer of the tree and expose it. This will require 2^X hashes in space. However, it will also allow us to truncate paths at the X+1 layer, rather than the root, saving k * X hashes. For X ~ log(k), this is a clear gain.
Motivational Example:
For k~256, we set X=8. We use 256 hashes to represent the exposed layer, but save 8 steps from each path. Effectively, each path is shortened by 7 steps.
Trees of 2^ [10/20/30] leaves have paths of length 10/20/30. Additionally, leaf preimages, ev values, signatures and path data add the equivalent of ~5 steps for each party. Thus, the approximate savings are: 7/15, 7/25, 7/35, i.e. 46%, 28% and 20%. Respectively.
For the current pools size of ca 2^12, this gives a 7/17 i.e. 42% size reduction.
Changes in the primitive:
Potentially. We can either change certificate aggregation and verification to expect compressed paths or add a pair of compression/decompression functions to convert between the two representations.
Changes in node logic:
Minimal. Nodes might need to store the “exposed” layer rather than only the AVK in order to aggregate.
Relevant to BP/Halo:
Yes. The benefits will be similar albeit smaller if the circuit uses a higher arity for the tree. The circuit will be hardcoded to expect a compressed representation of the tree, but the end result will be a more compact circuit. NB:there is no need for multiple versions of the tree here, we only need the compressed one.
Security impact:
None.
Low-Effort Alternative:
We can try and use off the shelf compression to exploit the overlap. This will likely provide smaller benefits as a compression function can not infer the value of a parent from the values of the children (and thus omit all parents after the exposed layer), but will be much simpler to implement. This low-effort approach does not work for circuit-based proofs however.