crypto-bigint icon indicating copy to clipboard operation
crypto-bigint copied to clipboard

Faster vartime division for Uint

Open andrewwhitehead opened this issue 1 year ago • 6 comments

Implements faster vartime division (vartime with the divisor only) for Uint<L> based on Knuth's TAOCP volume 2, as outlined at https://janmr.com/blog/2014/04/basic-multiple-precision-long-division/

This does not address vartime division for BoxedUint or the other TODOs in https://github.com/RustCrypto/crypto-bigint/pull/511

Relevant benchmarks:

wrapping ops/div/rem_vartime, U256/U128, full size
                        time:   [84.091 ns 84.290 ns 84.525 ns]
                        change: [-85.312% -85.155% -84.896%] (p = 0.00 < 0.05)
                        Performance has improved.
wrapping ops/rem_vartime, U256/U128, full size
                        time:   [84.129 ns 84.474 ns 84.890 ns]
                        change: [-84.208% -84.057% -83.898%] (p = 0.00 < 0.05)
                        Performance has improved.
wrapping ops/rem_wide_vartime, U256
                        time:   [166.95 ns 167.27 ns 167.69 ns]
                        change: [-94.750% -94.705% -94.636%] (p = 0.00 < 0.05)
                        Performance has improved.
Dynamic Montgomery arithmetic/MontyParams::new_vartime
                        time:   [459.28 ns 460.48 ns 461.88 ns]
                        change: [-80.512% -80.452% -80.390%] (p = 0.00 < 0.05)
                        Performance has improved.

andrewwhitehead avatar Jun 12 '24 23:06 andrewwhitehead

It looks like Uint::mul_mod would also be 90% faster using this update and split_mut/rem_wide_vartime, and could accept an even modulus. The current implementation uses MontyParams::new_vartime so it seems like it is vartime for the modulus, but maybe that should be renamed mul_mod_vartime anyway? Accepting a &NonZero<Uint> for the modulus would also simplify things.

andrewwhitehead avatar Jun 13 '24 17:06 andrewwhitehead

The current implementation uses MontyParams::new_vartime so it seems like it is vartime for the modulus, but maybe that should be renamed mul_mod_vartime anyway?

That's a good point... perhaps there should be both Uint::mul_mod as well as Uint::mul_mod_vartime with the former using MontyParams::new? (looks like it would need some extra bounds)

tarcieri avatar Jul 26 '24 17:07 tarcieri

@fjarri thoughts on this versus #511?

tarcieri avatar Jul 26 '24 18:07 tarcieri

This is definitely much faster than what I have there, and as for the naming stuff, I'll make a separate PR.

fjarri avatar Jul 27 '24 00:07 fjarri

This splits Uint::mul_mod and Uint::mul_mod_vartime: https://github.com/RustCrypto/crypto-bigint/pull/623

tarcieri avatar Jul 27 '24 15:07 tarcieri

I also have vartime division for BoxedUint implemented, which I was planning to add in a separate PR.

andrewwhitehead avatar Jul 27 '24 15:07 andrewwhitehead

Just a note: I experimented a little with trying to piggyback rem_wide_vartime() on div_rem_vartime(), and the following kinda works:

    pub const fn rem_wide_vartime<const R: usize>(lower_upper: (Self, Self), rhs: &NonZero<Self>) -> Self
    where
        Self: Concat<Output=Uint<R>>,
        Uint<R>: Split<Output=Self>,
    {
        let (lo, hi) = lower_upper;
        let wide_self: <Self as Concat>::Output = lo.concat(&hi);
        let (_q, r) = wide_self.div_rem_vartime(rhs);
        r
    }

The problem is, it then requires an <R> generic parameter in some Monty trait methods, which doesn't make sense when they're implemented for BoxedUint. So yeah, we'll have to wait until const traits are a thing so that we can say "Concat::Output is an Uint (or Integer)" without spelling out the concrete type.

fjarri avatar Aug 01 '24 22:08 fjarri