triton-vm icon indicating copy to clipboard operation
triton-vm copied to clipboard

Instruction Maybe-Wishlist for Recursion and Consensus

Open aszepieniec opened this issue 10 months ago • 13 comments

This is a tracking issue. We add imagined instructions that could make recursion (or consensus programs) faster.

read_mem_forward

  • BEFORE: *ptr
  • AFTER: (*ptr+n) [element]
  • With this instruction, 7 instructions inside the inner loop of (a static variant of) HashVarlen becomes 3.
  • Deprecated:
    • hash_var_len is only performance critical when the number of to-be-hashed elements is known a-priori, making other approaches feasible
    • dot_step (see below) eliminates the second next important use for read_mem_forward

dot_step

  • BEFORE: _ acc2 acc1 acc0 *lhs *rhs
  • AFTER: _ acc2' acc1' acc0' (*lhs+3) (*rhs+3)
  • With this instruction, 8 instructions inside the inner loop of InnerProductOfThreeRowsWithWeights becomes 1. Stands to reduce `1M to ~65000 cycles.
  • #272

merkle_step

  • BEFORE: _ merkle_node_idx [Digest; 5]
  • AFTER: _ (merkle_node_idx // 2) [Digest'; 5]
  • Moves up one layer in a Merkle tree. A combination of divine_sibling and hash into one instruction.
  • Replaces instruction divine_sibling. Instruction hash remains available as-is.
  • Does not change the height of the op stack, in contrast to divine_sibling and hash, which change the height of the stack by 5 elements each.
  • Done: 01c04e52df88bd7277c12473c001467dd8369ccf

get_colinear_y

  • BEFORE: _ ax [ay] bx [by] [cx] (possibly different order)
  • AFTER: _ [cy]
  • Where $c_y = \frac{a_y-b_y}{a_x-b_x} \cdot (c_x - a_x) + a_y$
  • This instruction can reduce the inner loop of compute_c_values from 74 instructions to 49. Total cycle count reduction: ~26000.

recurse_or_return

  • BEFORE: _ a b
  • AFTER: _ a b
  • Acts like recurse if $a \neq b$. Else, acts like return.
  • Replaces checks of the form dup <m> dup <n> eq skiz return <loop_body> recurse, reducing the op stack delta of loop maintenanance.
  • #288

absorb_from_mem

  • BEFORE: _ mem_ptr [garbage; 3]
  • AFTER: _ (mem_ptr - 10) [garbage; 3]
  • Reads 10 elements from RAM, adds them to the Sponge state, and applies the Tip5 permutation to the Sponge state.
  • Can be seen as replacement for read_mem 5 read_mem 5 sponge_absorb, albeit not a drop-in replacement due to the [garbage; 3], which is needed for arithmetization reasons.
  • Reduces the op stack delta from 10 to 0 compared to the instruction sequence it is intended to replace.
  • Done: 6dd9b54aa5558c369a9525fb294454d929c13a9f

aszepieniec avatar Mar 26 '24 11:03 aszepieniec