noir icon indicating copy to clipboard operation
noir copied to clipboard

ACIR gen / ACVM: Handle non-inlined ACIR functions

Open vezenovm opened this issue 1 year ago • 1 comments

Problem

After https://github.com/noir-lang/noir/issues/4427 we will have an opcode that tells us whether we should call a separate ACIR. We have no ability in the compiler to codegen separate ACIR functions as we currently inline all ACIR functions earlier during SSA gen.

Happy Case

TLDR; Jump to number (2) for the selected solution forward. I have kept the full initial comment for record keeping purpsoes.

We will need a new function attribute such as #[fold] that tells the compiler whether a certain ACIR function should be inlined during SSA gen. We then need ACIR gen to appropriately handle generating ACIR for multiple function calls and storing each ACIR as part of our main Circuit struct that gets sent to the backend. We could either do this with a recursive structure or a flat vector of circuits:

  1. Recursive Circuit type

The new circuit struct should look something like such:

pub struct Circuit {
    pub opcodes: Vec<Opcode>,
    pub circuits: Vec<Circuit>, // or Vec<Vec<Opcode>> as we may not want a recursive `Circuit` type
    // ... other fields
}

Each id in a Call opcode will point to an ACIR function. We want IDs as we want one ACIR per function. ACIR gen and the execution environment should be able to handle this requirement.

Making sure we appropriately codegen the function calls will not be trivial and will also require updates from the backend to appropriately handle the new Circuit type. Initially a backend can just perform ACIR function inlining on its own until it is ready to integrate more advanced features such as folding.

  1. Alternative flat list of circuits

Ultimately the backend needs two things to handle a stack of function calls:

  1. A list of circuits in the order to be executed
  2. An accumulated witness. The executor will be responsible for correctly offsetting witness values so that a function can fetch the appropriate input scalars.

Rather than a recursive circuit structure we should have the following:

pub struct Program {
    // Each ACIR circuit with the main function representing the first program
    pub functions: Vec<Circuit>,
    // other relevant info such as ABIs
}

We should initially construct a singular witness map for the entire call stack execution. It will then be up to the executor to set up the appropriate final call stack to send to the backend with the respective witness for each function to execute. It is important to note that the same circuit can be called multiple times. For these cases we will have the same ACIR, but a larger witness which needs to be appropriately handled.

I could see the structure getting transformed into something like this:

pub struct Program {
     // Each ACIR circuit with the main function representing the first program
     pub functions: Vec<Circuit>,
     // A call stack of the witness to use for the respective ACIR. The size of this witness stack should be 
     // the same as the number of `Call` instruction in 
     pub witnesses: Vec<WitnessMap>,  
     // other relevant info such as ABIs
}

From this list of function calls and witness maps the backend can appropriate construct multiple circuits to be folded together.

Project Impact

None

Impact Context

No response

Workaround

None

Workaround Description

No response

Additional Context

No response

Would you like to submit a PR for this Issue?

None

Support Needs

No response

vezenovm avatar Feb 26 '24 17:02 vezenovm

I'm going forward with the flat list of circuits approach.

To expand the initial proposal listed above:

We probably need to store some index with the list of witnesses to more accurately represent a function stack.

Program {
    functions: Vec<Circuit>,
}

WitnessStack {
    stack: Vec<StackItem>,
}

StackItem {
    // Index into the program's function list for which we have an associated witness
    index: u32,
    // Function witness
    witness: WitnessVector,
    // potentially other information such as inputs/outputs to be copy constrained by the backend between stack items
}

The backend will then do something like such:

auto constraint_systems: vector<acir_format::AcirFormat> = get_program(bytecode);
auto witness_stack: vector<(index, WitnessVector)> = get_witness_stack(bytecode);

while (!witness_stack.empty()) {
    let witness = witness_stack.pop_back();
    let constraint_system = constraint_systems[witness.0];

    // create foldable circuit
    // accumulate and prove per loop step
    // verify after the loop is finished
}

Some pseudocode showing how the final program and stack passed to the backend may look:

main(x) -> out {
    let out = A(x);
    A(out + 2)
    // Not enough to predicate just the call, read below for more
    // B(out) * predicate
    // C(out) * !predicate
    if x == 1 {
        B(out, predicate)
    } else {
        C(out, predicate)
    }
}

C(input) -> out {
    assert(input == 5) * predicate
    input + 1 * predicate
}

B(input) -> out {
    assert(input == 5) * predicate
    (input + 2) * predicate
}

B ACIR: 
    AssertZero(input - 5 == 0)
    x0 + 2

A(input) -> out {
    B(input)
}

A ACIR:
    Call { id: 2, inputs: [x0] },
    A witness: { x0 }

main ACIR:
    Call { id: 1, inputs: [x0], }
    x1 + 2 = x2;
    AssertZero(x2 - 2 == 0)
    Call { id: 1, inputs: [x2], .. }
    // Calls with side effects 
    Call { id: 2, inputs: [x2, predicate], .. }
    Call { id: 3, inputs: [x2, predicate], .. }

Stack if true:
B, A, B, A, B, main
2, 1, 2, 1, 2, 0

Stack if false:
B, A, B, A, C, main
2, 1, 2, 1, 0

How do we guarantee that a prover does not leave out `B` or `C` when folding?
We must maintain the same stack.

Stack must remain the same (use a predicate):
B, A, B, A, B, C, main

vezenovm avatar Mar 05 '24 16:03 vezenovm