mithril icon indicating copy to clipboard operation
mithril copied to clipboard

Unbalanced Trees for Unbalanced Stake Distributions

Open pyrros opened this issue 3 years ago • 2 comments

Currently, the Merkle tree used for the AVK is balanced which implies that signatures produced by any two parties will have the same length. Since parties with high stake are expected to produce signatures more often, it will be beneficial to place them in short branches. Vice versa, parties with low stake may be placed deeper with little cost (if amortized over the frequency of signing).

A variant of Huffman coding may be the optimal choice for this, but allowing the path length to vary completely might be too complex for circuit based systems. As an inbetween solution, consider a root with two subtrees: a “short” tree with the 2^Y largest stakeholders, and a “long” tree with everybody else.

Motivational Example

On June 08, Cardano had 2800 stake pools. This rounds up to a tree with 2^12 leaves and paths of length 12.

From these pools, the Top 512, held 92% of the stake. A subtree for these users would have paths of length 9 (+1 due to the subtree). The other users would lie in a tree with paths of length 12 (+1 due to the subtree). This implies that ~92% of signatures would have length 2 shorter than before, and ~8% would be one longer. On average, a signature would be shorter by 1.76 steps (~14%).

Changes in the primitive:

Yes. Verification must be able to handle paths of varying lengths (might be implicitly supported in the current version). Aggregate key generation will need to be changed to produce unbalanced trees. Changes in node logic: None.

Relevant to BP/Halo:

Limited. Circuit-based systems do not work well with conditional branches as we often need to represent both branches in the circuit and thus pay the cost of both options (or the more costly of the two if we can fold them together). This implies that a single circuit approach will obtain no benefit from the above.

For the basic short/long option, we might opt to prepare a number of preset circuits, each having a different mix of short and long branches opting to use the smallest circuit that is appropriate at each point. This will require some amount of effort and limit the benefits to a degree.

Security impact:

None. (Also see Scaling Potential).

Scaling Potential:

Incentive structures in place may keep the size of the stakeholder pool close to its current value.This implies that the savings percentage will likely not increase further. On the other hand, the above proposal can maintain performance against malicious/capricious users who wish to operate a significant number of stake pools with insignificant stake.

pyrros avatar Oct 19 '21 12:10 pyrros

Seems a bit like premature optimisation to me.

abailly-iohk avatar Oct 20 '21 12:10 abailly-iohk

An efficiency improvement that would be easy to implement with balanced trees with unbalanced distributions, is that we could order the keys by its stake distribution (and not by the x-coordinate of the public key). This would keep parties with large amounts of stake together, which will result in better batched membership proofs. The most efficient representation of a batch proof is when two leafs are siblings (as they share all the path). If they are opposite in the tree, then the batched path will be maximal (we are only saving the root). So by ordering large stakeholders together in the merkle tree, we increase the likelihood of two signers being close in the tree.

iquerejeta avatar Apr 28 '22 11:04 iquerejeta