Release tracking issue: structure crypto primitives to support multiple field arithmetic implementations
Is your feature request related to a problem? Please describe.
Our existing crypto primitives are all built on top of field arithmetic implementations pulled from Arkworks, originally for convenience, since we could pull them off the shelf and start using them. However, we're now experiencing pain points exposing the limitations of this approach:
- In the hardware wallet context, we can't easily run our crypto code in a
no_std+no_alloccontext, since the Arkworks traits require allocation. - In the WASM context, we have terrible performance because Arkworks implements big-integer field arithmetic using 64-bit words and a 128-bit multiplier, but because WASM does not provide such a multiplier, each "base" operation must itself be emulated, causing a huge slowdown.
To deal with both of these problems, we need to take ownership of deeper parts of our cryptography primitives, and ensure our implementations are tuned to our needs. In particular:
- We need to be able to use a minimal subset of our cryptography (everything needed to compute an
EffectHashand aSpendAuthsignature) in ano_std+no_alloccontext. - We need to be able to select between 32- and 64-bit field arithmetic implementations. We want to use 64-bit arithmetic by default, but be able to select 32-bit arithmetic for hardware wallets (no proving, no Arkworks,
no_std,no_alloc) and for WASM (proving, with Arkworks)
Describe the solution you'd like
Here's a proposed approach that I believe could address both of these issues.
-
[x] penumbra-zone/decaf377#60
We'll need one implementation for each combination of field and bitsize:
- [x]
Fp32: proving curve base field, 32-bit - [x]
Fq32: proving curve scalar field /decaf377base field, 32-bit - [x]
Fr32:decaf377scalar field, 32-bit - [x]
Fp64: proving curve base field, 64-bit - [x]
Fq64: proving curve scalar field /decaf377base field, 64-bit - [x]
Fr64:decaf377scalar field, 64-bit
The field arithmetic should not have external dependencies, should not use the Arkworks traits, and should not allocate. This is a lot of field arithmetic implementations. Instead of writing them by hand, let's use
fiat-crypto's Rust backend to generate formally verified field arithmetic for each of our field / wordsize combinations. Ideally, we won't have to edit the generated code much or at all, but we'll need to do some discovery there.Each of these field arithmetic implementations should be in separate submodules (e.g.,
crate::fields::{u32::{fp::{Fp32, ...}, ...}, ...}, using the facade pattern for re-exports. - [x]
-
[x] https://github.com/penumbra-zone/decaf377/issues/65
The initial wrapper types don't have inversion implemented. We'll need that.
-
[x] penumbra-zone/decaf377#61
Using a similar strategy as I designed for
curve25519-dalek, provide a way to select a backend at compile-time:- [x] Add
u32_backend,u64_backendfeatures in the crate - [x] Set the
u64_backendas a default feature - [x] Add
#[cfg]-gated type aliases, so that-
crate::fields::{Fp, Fq, Fr} = crate::fields::{Fp32, Fq32, Fr32}when theu32_backendis enabled; -
crate::fields::{Fp, Fq, Fr} = crate::fields::{Fp64, Fq64, Fr64}when theu64_backendis enabled; - compilation fails with a helpful error if either (a) no backend is selected or (b) multiple backends are selected.
-
Note that this approach is slightly "illicit", in that crate features are supposed to be purely additive. This will mean we'll need to propagate annoying feature selection logic through our dependency tree (see the way that
curve25519-dalekandx25519-dalekorbulletproofsbackend features interact). However, the really big upside, the upside that makes it all worthwhile, is that unlike a generics-based approach, the backend selection can be transparent to depending code (though not to dependingCargo.tomlconfigurations). - [x] Add
-
[x] penumbra-zone/decaf377#62
Next, restore compatibility between our new field arithmetic implementations and the Arkworks stack. This will involve writing implementations of the Arkworks field arithmetic traits that wrap whatever interface the
fiat-cryptocodegen creates. The trait implementations should be in separate files, so we can feature-gate the implementation under anarkworksfeature in a single spot (the submodule declaration). Since trait implementations are global, there's no re-exporting required, trait impls can be tucked away in a submodule.We can do a single trait impl first, and then use copy-paste and search-and-replace technology to replicate it across the other fields. Assuming that the
fiat-cryptocodegen produces the same interface, independent of limb size, we could potentially implement the compat traits on theFp,Fq,Frtype aliases, rather than on the backend types. -
[x] https://github.com/penumbra-zone/decaf377/issues/69
-
[x] https://github.com/penumbra-zone/decaf377/issues/70
-
[x] penumbra-zone/decaf377#63
We need to provide a minimal subset of
decaf377operations in ano_std+no_alloccontext. To do this, we'll need to have a basic elliptic curve implementation that doesn't depend onark-ecor any other part of Arkworks. Minimal means all the operations that would be required to produce signatures and effect hashes:- [x] Decode + Encode
- [x] Map-to-Group
- [x] Add, Sub
- [x] Constant-time scalar multiplication (no lookup tables or anything)
Other functionality can be provided by feature-gated wrappers around calls to methods in
ark-ec(see above). If these need to translate between point representations, the extra memcpy's won't be a big deal. For instance, signature verification uses a double-base variable-time scalar multiplication, but this is much more expensive than converting point representations (as long as we're not doing an inversion or something). -
[x] #3673
The
decaf377-rdsacrate should have matching feature-gating to allow selection ofu32oru64backends, and to avoid Arkworks as a required dependency. We only care about producing signatures in the minimalno_std+no_alloccontext, so it's fine if the verification methods become feature-gated. -
[x] #3674
Key agreement is also used for effect hash computations, so we'll need to use it in the minimal context and have corresponding Cargo features.
-
[ ] #3675
We need to use the Poseidon code in the minimal context, too. We can have that crate depend on
decaf377for the field arithmetic implementations and have backend selection features. We'll need to be able to avoid an Arkworks dependency for computing hash values, and only rely on it when making proofs. -
[ ] #3676
It should be possible to re-instantiate all of the circuit implementations with the backend-selectable
Fp,Fq,Frimplementations in thedecaf377crate. This allows unlocking WASM performance by avoiding the 128=>64-bit translation overhead.
I think this subsumes penumbra-zone/penumbra#3512.
fiat-crypto has a web interface, neat: https://mit-plv.github.io/fiat-crypto/?argv=%5B%22word-by-word-montgomery%22%2C%22--lang%22%2C%22Rust%22%2C%22fq%22%2C%2232%22%2C%228444461749428370424248824938781546531375899335154063827935233455917409239041%22%5D&interactive
Some output for Fq: https://gist.github.com/hdevalence/6556634efe15f660919561d2d55b8443
Some output for Fr: https://gist.github.com/hdevalence/3b3c9f9574565b1eaf78cedd888081c7
@hdevalence It's worth pointing out that 32-bit limbs in the WASM context isn't always the ideal limb size depending on the curve construction used. For instance, reference https://github.com/mitschabaude/montgomery and https://hackmd.io/HKHbMzxIQmOQUFNl7CF5sg where 13 x 30 bit montgomery multiplication is optimal. In the GPU environment, since the max unsigned integer type is 32-bits, the max limb-size supported is 16-bits wide. There we see the optimal limb size to be 13-bit limbs.
For the contexts we care about -- the Cortex-M3 on a Ledger, and wasm in the browser -- we do have a 64-bit multiplier, so we can use 32-bit limbs there. I suspect that code sharing with a GPU will be more difficult, so I'd like this implementation to focus on the 32- and 64-bit cases.
I did the first part -- generating code with fiat-crypto -- in https://github.com/penumbra-zone/decaf377/pull/64
I think a good next step would be to work on filling in arkworks compatibility, making it so that the new Fp, Fq, Fr types implement the arkworks PrimeField trait when the arkworks feature is enabled. I created arkworks.rs files for those impls to live in but didn't populate them.
Great! I'll pick up from there tonight