triton-vm
triton-vm copied to clipboard
Instruction Maybe-Wishlist for Recursion and Consensus
This is a tracking issue. We add imagined instructions that could make recursion (or consensus programs) faster.
read_mem_forward
❌
- BEFORE:
*ptr
- AFTER:
(*ptr+n) [element]
- With this instruction, 7 instructions inside the inner loop of (a static variant of)
HashVarlen
becomes 3. - Deprecated:
-
hash_var_len
is only performance critical when the number of to-be-hashed elements is known a-priori, making other approaches feasible -
dot_step
(see below) eliminates the second next important use forread_mem_forward
-
dot_step
✅
- BEFORE:
_ acc2 acc1 acc0 *lhs *rhs
- AFTER:
_ acc2' acc1' acc0' (*lhs+3) (*rhs+3)
- With this instruction, 8 instructions inside the inner loop of
InnerProductOfThreeRowsWithWeights
becomes 1. Stands to reduce `1M to ~65000 cycles. - #272
merkle_step
✅
- BEFORE:
_ merkle_node_idx [Digest; 5]
- AFTER:
_ (merkle_node_idx // 2) [Digest'; 5]
- Moves up one layer in a Merkle tree. A combination of
divine_sibling
andhash
into one instruction. -
Replaces instruction
divine_sibling
. Instructionhash
remains available as-is. - Does not change the height of the op stack, in contrast to
divine_sibling
andhash
, which change the height of the stack by 5 elements each. - Done: 01c04e52df88bd7277c12473c001467dd8369ccf
get_colinear_y
- BEFORE:
_ ax [ay] bx [by] [cx]
(possibly different order) - AFTER:
_ [cy]
- Where $c_y = \frac{a_y-b_y}{a_x-b_x} \cdot (c_x - a_x) + a_y$
- This instruction can reduce the inner loop of
compute_c_values
from 74 instructions to 49. Total cycle count reduction: ~26000.
recurse_or_return
✅
- BEFORE:
_ a b
- AFTER:
_ a b
- Acts like
recurse
if $a \neq b$. Else, acts likereturn
. - Replaces checks of the form
dup <m> dup <n> eq skiz return <loop_body> recurse
, reducing the op stack delta of loop maintenanance. - #288
absorb_from_mem
✅
- BEFORE:
_ mem_ptr [garbage; 3]
- AFTER:
_ (mem_ptr - 10) [garbage; 3]
- Reads 10 elements from RAM, adds them to the Sponge state, and applies the Tip5 permutation to the Sponge state.
- Can be seen as replacement for
read_mem 5 read_mem 5 sponge_absorb
, albeit not a drop-in replacement due to the[garbage; 3]
, which is needed for arithmetization reasons. - Reduces the op stack delta from 10 to 0 compared to the instruction sequence it is intended to replace.
- Done: 6dd9b54aa5558c369a9525fb294454d929c13a9f