snark Small Vec Linear Combinations

Summary

As currently written, a symbolic LC is created on every operation. We should expect that most Symbolic LC's are small, and probably should try to keep things on stack via small vec optimizations, perhaps using https://github.com/servo/rust-smallvec.

We definitely shouldn't switch without a more detailed benchmark suite (getting average symbolic LC size, seeing how much of the time is spent in mallocs, seeing if things actually get faster etc.)

For Admin Use

[ ] Not duplicate issue
[ ] Appropriate labels applied
[ ] Appropriate contributors tagged
[ ] Contributor assigned/self-assigned

Jan 03 '21 21:01 ValarDragon

A separate proposal would be to avoid the generation of too many symbolic LC.

Currently, each AllocatedFp is associated with a Variable in the constraint system. Thus, mul_by_constant and add would all create a new symbolic LC.

One potential direction is to let AllocatedFp uses a linear combination of variables instead of a symbolic LC, if it is optimized for the number of constraints.

Jan 04 '21 05:01 weikengchen

One potential direction is to let AllocatedFp uses a linear combination of variables instead of a symbolic LC, if it is optimized for the number of constraints.

I think this is a great idea. The alternative idea I was thinking of was to introduce drop semantics to variables, but that was growing in complexity very rapidly. (We may still need something of that form to remove symbolic LC's that are intermediate computation steps though, in the case of density optimizations)

Jan 04 '21 16:01 ValarDragon

One potential direction is to let AllocatedFp uses a linear combination of variables instead of a symbolic LC, if it is optimized for the number of constraints.

One way to do this is to change the Variable enum as follows:

pub enum Variable<F> {
	/// Represents the "zero" constant.
    Zero,
    /// Represents of the "one" constant.
    One,
    /// Represents a public instance variable.
    Instance(usize),
    /// Represents a private witness variable.
    Witness(usize),
    /// Represents of a linear combination.
    SymbolicLc(InlineOrIndex<F>),
}

// Needed if we want to make this an implementation detail.
pub struct InlineOrIndex<F>(InlineOrIndexInner<F>);

enum InlineOrIndexInner<F> {
	Index(usize),
	Lc(StackVec<(F, Variable<F>)>), // Not sure if this needs to be `Box`
}

The downside of this approach is that Variable is no longer Copy, but can only be Clone; this is a breaking change. Another breaking change is the addition of the Field type parameter F.

Jan 04 '21 18:01 Pratyush

Separately, one might want to reduce the memory consumption of Variable by replacing usize with u32. We could also try u48 if u32 is too small; this can be implemented as a new type around [u8; 6] or [u16; 3] or (u32, u16).

Along with coefficient-interning (#122), this could lead to a significant memory improvement.

Jan 04 '21 18:01 Pratyush

I feel like we will need to eventually remove Copy from Variable anyway. Even in the case of optimizing for density, I think we want to add some drop semantics so that we can prune intermediate, only used once, symbolic LC's from memory. (Which then necessitates removing copy)

Separately, one might want to reduce the memory consumption of Variable by replacing usize with u32. We could also try u48 if u32 is too small; this can be implemented as a new type around [u8; 6] or [u16; 3] or (u32, u16).

Its not clear to me that that is going to be reducing memory under the inline or index API. Symbolic LC could still be storing an LC, which is a vec, which AFAIU takes the same size as 3 * sizeof(usize). (So the stack size taken would increase)

In the case where it doesn't have inline or index, its still not clear to me that it reduces size, since the enum itself has memory overhead AFAIU. (One byte for under 256 options, but I don't know how this is aligned)

I think changing FpVar to directly use a linear combination in both cases may work, since the LC can just store the symbolic variable (which is equivalent to what is already happening)

Jan 04 '21 19:01 ValarDragon

Its not clear to me that that is going to be reducing memory under the inline or index API. Symbolic LC could still be storing an LC, which is a vec, which AFAIU takes the same size as 3 * sizeof(usize). (So the stack size taken would increase)

You're right, that optimization is not useful. Separately, to compress the size for SymbolicLC, we could Box that particular variant of the enum, which should bring the size down to size_of::<usize>().

I think changing FpVar to directly use a linear combination in both cases may work, since the LC can just store the symbolic variable (which is equivalent to what is already happening)

The reason I wanted to put this in ConstraintSystem and not just in FpVar is to enable this for other gadgets too.

Jan 05 '21 03:01 Pratyush

In the case of density optimizations, I think we want to keep building it up in LC representation, and condense that into a saved Symbolic LC every time that variable is used in a constraint. That approach has the problem of many operation may mutate both argument. (e.g. a *= b mutates b).

An alternative to that is that we introduce total_num_refs + drop semantics to Variable. Every time a variable is cloned we increment its total_num_refs. If a variable is dropped and its total_num_refs was 1, we know it was an intermediate variable, and it can be pruned from the symbolic LC map. (This approach should still have worse performance than the first approach, since we save the symbolic LC and remove it later, instead of just never saving it in the first place)

Jan 05 '21 19:01 ValarDragon

Copying over from another issue:

I think there are two approaches:

We have linear combinations either in the variable directly, or in the lc_map, but not in both. This way, if we inline LCs, then when the variable goes out of scope, the memory for the LC is freed immediately, and we don't have to clean up the lc_map.
We hold a Rc<RefCell<LcMap>> or something inside a SymbolicLc, and when the SymbolicLc gets dropped, we look inside the LcMap and do some garbage collection there. Not sure how expensive this would be, though. This latter approach is similar to your approach

Separately, it would also be worthwhile to consider what can be done for density optimizations; the foregoing mostly only makes sense when optimizing for constraints.

Jan 05 '21 19:01 Pratyush

If a variable is dropped and its total_num_refs was 1, we know it was an intermediate variable, and it can be pruned from the symbolic LC map.

Every variable is eventually dropped, so we can't necessarily prune, if we want to have the "outlining" optimization available to use.

Jan 05 '21 19:01 Pratyush

Every variable is eventually dropped, so we can't necessarily prune, if we want to have the "outlining" optimization available to use.

Err total_num_refs was probably the wrong name. I meant it as max_num_refs, e.g. if there was ever more than 1. On further thought, this should instead be a boolean. (Perhaps re_used_lc as the name) And then every time a constraint is created from an LC, that lc gets saved to the LC map.

Jan 05 '21 19:01 ValarDragon

snark snark copied to clipboard

Small Vec Linear Combinations

Summary

For Admin Use

snark
snark copied to clipboard