triton-vm U32 Table Design explorations

There are multiple ways to achieve the current functionality of the U32 Table. Let's explore the effects of different designs on both TritonVM as a whole and the recursive verifier specifically.

Re-Design Exploration 1 – Combine Permutation Arguments

Currently, there are 5 Permutation Arguments between the Processor Table and the U32 Table. The running product for all these Permutation Arguments is only updated in rows where the indicator idc is 1.

By adding column current instruction ci to the U32 Table, we can get away with only one Permutation Argument, which is conditional on not only on the indicator idc but also on ci.

Immediate upsides:

fewer columns in U32 Table
fewer columns in Processor Table

Immediate downsides:

AIR of slightly higher degree for U32 Table
AIR of slightly higher degree for Processor Table
no re-usability of different result columns for same input

Re-Design Exploration 2 – Combine Result Columns

The Processor conditions the legal relation of two consecutive rows based on the current instruction ci. If we add ci as a column to the U32 Coprocessor, we can do exactly the same thing. This allows us to combine currently existing columns LT, AND, XOR, REV into a single column OP_RESULT.

Immediate upsides:

fewer columns in U32 Table

Immediate downsides:

considerably more complex AIR for U32 Table

Re-Design Exploration 3 – Ditch U32 Table

Instead of using a dedicated Coprocessor for u32 operations, we can enhance and use the Processor directly. This requires decomposition into bits using helper variables, which can then be immediately combined into the desired result. In conjunction with re-design exploration 4, the negative impacts are less severe by a factor of 8.

Immediate upsides:

less high-level complexity of Triton VM

Immediate downsides:

many, many more columns in the Processor Table
somewhat more complex design of the Processor

Re-Design Exploration 4 – Add U8 Lookup Table

Currently, the U32 Coprocessor deals with one bit at a time. Having access to a static, pre-computed lookup table for bytes allows the U32 Coprocessor to deal with one byte at a time. Correct lookup is proven via some argument, and might require the introduction of a new argument type to Triton VM.

Immediate upsides:

faster U32 Coprocessor

Immediate downsides:

higher design complexity

(Originally issue number 26 in the internal issue tracker.)

Aug 08 '22 08:08 jan-ferdinand

(Originally posted by @bobbinth.)

In Miden, we have a separate co-processor for bitwise u32 operations: AND, OR, and XOR. These co-processor decomposes inputs into individual bits (4 bits per row), computes operations on individual bits, and then aggregates the result. So, it takes 8 rows to compute a single u32 AND/OR/XOR. As described here, it is actually possible to get rid of 2 of these 3 operations, and emulate everything else with just a single operation.

All other u32 operations we perform directly on the stack. This includes u32add, u32sub, u32mul, u32div etc. (we have 8 such operations in total). All these operations require between 2 and 4 16-bit range checks, but for simplicity we implemented all of them as consuming exactly 4 16-bit range checks.

Through Miden assembly we expose quite a few more u32 operations but all of them are emulated using one of the above operations (e.g., lt is emulated as u32sub swap drop eqz not).

Aug 08 '22 08:08 jan-ferdinand

This issue is de-facto blocked by not having a benchmarking target. See #7.

Aug 08 '22 08:08 jan-ferdinand

See #9 for the type of “new argument type” alluded to in option 4.

Aug 08 '22 08:08 jan-ferdinand

The U32 Table has been dropped for now. This issue can be re-opened if acceleration for u32 operations are deemed necessary.

Oct 17 '22 15:10 jan-ferdinand

Acceleration for u32 operations are deemed necessary.

Dec 09 '22 09:12 jan-ferdinand

@aszepieniec and I have developed the following design for a next iteration of the U32 Table. None of the functionality provided by the U32 Table changes. The new design has fewer rows but more columns, which is beneficial for overall speed.

The base columns are:

name	description
`CopyFlag`	Should the result from this row be sent to the processor?
`LimbWidth`	How wide each of the limbs is, in bits.
`CI`	The instruction the processor is currently executing. Only relevant if `CopyFlag` is set.
`LHS_hi`	The high bits of the left-hand side operand.
`LHS_lo`	The low bits of the left-hand side operand.
`RHS_hi`	The high bits of the right-hand side operand.
`RHS_lo`	The low bits of the right-hand side operand.
`AND_hi`	The bit-wise and of `LHS_hi` and `RHS_hi`.
`AND_lo`	The bit-wise and of `LHS_lo` and `RHS_lo`.
`LT_hi`	Result of the less-than comparison between `LHS_hi` and `RHS_hi`.
`LT_lo`	Result of the less-than comparison between `LHS_lo` and `RHS_lo`.
`log_2_floor_hi`	The number of bits needed to represent `LHS_hi` minus one.
`log_2_floor_lo`	The number of bits needed to represent `LHS_lo` minus one.
`mul_self`	The multiplicity with which the U32 Table has looked up `LHS` and `RHS` in itself.
`mul_processor`	The multiplicity with which the processor has looked up this instruction (with the same operands.)

The U32 Table looks up successively narrower limbs of the operands in itself, essentially creating a (stacked) multi-cascade lookup table. See § 4.2 of the Tip5 paper for an explanation of cascading lookup tables.

The last section of the U32 Table, i.e., when LimbWidth is equal to 1, corresponds to a narrow lookup table that can be computed by the verifier in full. See § 4.3 of the Tip5 paper for an explanation of narrow lookup tables.

The rules for u32 instruction pow are somewhat distinct from the other u32 instructions'. Concretely, the left-hand side LHS, serving as the instruction's base, is not restricted to be in the u32 range. Consequently, the new U32 Table has a dedicated section for instruction pow.

Feb 17 '23 13:02 jan-ferdinand

triton-vm triton-vm copied to clipboard

U32 Table Design explorations

Re-Design Exploration 1 – Combine Permutation Arguments

Re-Design Exploration 2 – Combine Result Columns

Re-Design Exploration 3 – Ditch U32 Table

Re-Design Exploration 4 – Add U8 Lookup Table

triton-vm
triton-vm copied to clipboard