Fields with lazy reduction
When exposing FieldElement from curve25519-dalek (https://github.com/dalek-cryptography/curve25519-dalek/pull/787), it was noted that curve25519-dalek only reduces its field elements occasionally, and that the proposed PR failed to model that in the name of safety. https://github.com/dalek-cryptography/curve25519-dalek/pull/813 resolved this by introducing LazyField, a trait for a field which only occasionally performs reductions, using typenum to track capacity consumption via the Rust type system. This allows reductions to be deferred until they're actually necessary, while ensuring operations remain well-defined.
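To illustrate the idea (not the actual API from the PR), here's a toy sketch where the capacity counter is a runtime field; the real design moves this counter into the type system via typenum, so that exhausting capacity is a compile-time error rather than an assertion. All names (`Lazy`, `MAX_CAPACITY`, `reduce`) are illustrative, and a small modulus stands in for a real 255-bit field:

```rust
/// Toy "field element" over a small modulus, standing in for a real
/// field element whose limbs have spare bits for unreduced values.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Lazy {
    value: u64,         // possibly unreduced representative
    capacity_used: u32, // how much lazy capacity has accumulated
}

const MODULUS: u64 = 251;
const MAX_CAPACITY: u32 = 4; // lazy steps allowed before reducing

impl Lazy {
    fn new(value: u64) -> Self {
        Lazy { value: value % MODULUS, capacity_used: 1 }
    }

    /// Lazy addition: no modular reduction, just track consumption.
    fn add(self, rhs: Self) -> Self {
        let used = self.capacity_used + rhs.capacity_used;
        // With typenum this check is enforced at compile time; here
        // it's only a debug assertion.
        debug_assert!(used <= MAX_CAPACITY, "capacity exhausted");
        Lazy { value: self.value + rhs.value, capacity_used: used }
    }

    /// Reduce back to canonical form, resetting consumption to 1.
    fn reduce(self) -> Self {
        Lazy { value: self.value % MODULUS, capacity_used: 1 }
    }
}
```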
In practice, there's no performance reason not to use LazyFieldWithCapacity<U1>. Any existing field can be wrapped with EagerField to achieve API compatibility, and any field with any amount of spare capacity benefits. It is suboptimal for fields with even greater capacity, yet those would already lose that benefit if Field alone were used (which would mandate performing a reduction after every single operation to remain well-defined in a constant-time context).
Ideally, these traits would not permanently reside in curve25519-dalek but would instead be upstreamed somewhere they can achieve wider adoption. This would mean ff (which so far hasn't adopted typenum) or somewhere in the RustCrypto ecosystem (primefield, elliptic-curve, or a new crate). I wanted to create this issue to discuss the traits and where would be optimal for them, and I'd also like to invite review of them. While I believe my prototype accomplishes its goals, and is fine to publish under curve25519_dalek::hazmat for now to finally expose the curve25519_dalek FieldElement type, I also believe the traits could benefit from further review and fine-tuning.
One alternative to involving typenum is a much coarser-grained approach where add/double/sub/neg on the non-lazy type return the lazy field element type, and multiplication/square (defined on both the lazy and non-lazy types) always return the non-lazy type.
That doesn't let you fully leverage the extra space, particularly for a 255-bit curve like Curve25519, but it would be significantly simpler, avoids typenum (I say this despite the fact we use it everywhere else), and means every curve with a lazy field implementation could expose it as a least common denominator.
It also means protocol implementations that leverage lazy reductions don't need to have special case behavior based on the number of operations that can be performed prior to a reduction.
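A rough sketch of what that coarser-grained two-type shape could look like (all names hypothetical, toy modulus; the point is only which type each operation returns):

```rust
/// Canonical (reduced) field element, toy modulus.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Fe(u64);
/// Possibly-unreduced element: at most one lazy step from canonical.
#[derive(Clone, Copy, Debug, PartialEq)]
struct LazyFe(u64);

const P: u64 = 251;

impl Fe {
    // add/double return the lazy type: no reduction is performed.
    fn add(self, rhs: Fe) -> LazyFe { LazyFe(self.0 + rhs.0) }
    fn double(self) -> LazyFe { LazyFe(self.0 * 2) }
    // mul/square reduce internally and return the eager type.
    fn mul(self, rhs: Fe) -> Fe { Fe((self.0 * rhs.0) % P) }
}

impl LazyFe {
    // Multiplying lazy inputs also lands back in the eager type.
    fn mul(self, rhs: LazyFe) -> Fe { Fe((self.0 * rhs.0) % P) }
    fn reduce(self) -> Fe { Fe(self.0 % P) }
}
```

The type system still prevents stacking unreduced additions indefinitely (adding requires the eager type), just with a fixed one-step budget instead of typenum's counted one.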
I'm not sure what I'm proposing has sufficient expressive power versus what you're proposing, however. I think, implemented correctly, we could use this abstraction in primeorder to implement scalar multiplication in such a way that it leverages the lazy field implementations in both k256 and p521; then k256 would no longer need a separate implementation of scalar multiplication (and p521 could actually benefit from its lazy field element implementation). Perhaps that would make a good acid test for the trait design?
If the goal is completely avoiding typenum, that's fine. If the goal is avoiding the API pain which is typenum (due to Rust not supporting const generics with sufficient functionality, of course), then we can add dedicated traits without typenum in their API, which are automatically implemented for anything implementing the LazyField trait. This lets users choose to be comprehensive via typenum, or reasonable by only allowing one step of reduction (which I've previously advocated as a sensible default). I don't think we have to move entirely over to the simpler API, just also offer it.
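The "automatically implemented" part could be a blanket impl. A hedged sketch, with hypothetical trait and method names and the typenum machinery elided down to a single lazy-add-then-reduce operation:

```rust
/// Stand-in for the comprehensive, capacity-tracking trait (the real
/// one would track capacity via typenum); here just the two ops a
/// one-step user needs.
trait LazyField {
    type Unreduced;
    fn lazy_add(self, rhs: Self) -> Self::Unreduced;
    fn reduce(u: Self::Unreduced) -> Self;
}

/// The "reasonable" trait: exactly one lazy step, no typenum in its API.
trait OneStepLazy: Sized {
    fn add_then_reduce(self, rhs: Self) -> Self;
}

/// Blanket impl: every capacity-tracking field gets the simpler API
/// for free, so implementors only write the comprehensive trait.
impl<F: LazyField> OneStepLazy for F {
    fn add_then_reduce(self, rhs: Self) -> Self {
        F::reduce(self.lazy_add(rhs))
    }
}

// Toy field to exercise the traits.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Fe(u64);
const P: u64 = 251;

impl LazyField for Fe {
    type Unreduced = u64;
    fn lazy_add(self, rhs: Self) -> u64 { self.0 + rhs.0 }
    fn reduce(u: u64) -> Fe { Fe(u % P) }
}
```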
Re: typenum as a dependency: anything which pulls in generic-array, hybrid-array, or digest will include it (at least indirectly), and the RustCrypto curves will all include typenum accordingly. My own work bundles digest with group. curve25519-dalek has a from_hash API, and now a hash-to-curve API, bundling digest. The only curves in the group ecosystem without typenum in tree may be zkcrypto/Zcash's (pasta_curves doesn't even include zeroize :/ ). While it's probably a blocker to upstreaming into ff, I don't see it as an issue otherwise. The only question would be whether we want to try to upstream the simpler API into ff, and then either leave the more complex API under RustCrypto or abandon it as unlikely to be used.
It was mostly for the sake of argument, based in part on @fjarri's arguments.
That said, I am still curious exactly how you would use lazy reduction to implement an algorithm used across multiple curves/fields where the number of lazy operations you can perform before a reduction varies from curve to curve/field to field.
I think it's probably possible to wrap a WithCapacity<2> into a WithCapacity<3>, just as EagerField wraps a Field into a WithCapacity<1>, letting whoever writes the function declare how much capacity it can take advantage of. I just haven't investigated actually implementing that, but it'd be optimal 'once and for all'.
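To make the wrapping idea concrete, here's a hypothetical sketch with a runtime counter standing in for the type-level one: a field whose real lazy capacity is 2 is presented with an apparent capacity of 3 by inserting a reduction whenever the real limit would be exceeded, the same trick EagerField uses to present a plain Field as WithCapacity<1>. All names are illustrative:

```rust
/// Toy element whose representation tolerates only 2 lazy additions.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Fe2 { value: u64, used: u32 }

const P: u64 = 251;
const REAL_CAPACITY: u32 = 2;

impl Fe2 {
    fn new(v: u64) -> Self { Fe2 { value: v % P, used: 1 } }
    fn reduce(self) -> Self { Fe2 { value: self.value % P, used: 1 } }
    fn lazy_add(self, rhs: Self) -> Self {
        Fe2 { value: self.value + rhs.value, used: self.used + rhs.used }
    }
}

/// Wrapper exposing a larger apparent capacity: it reduces the inner
/// elements early whenever the real capacity would be exceeded, so
/// callers written against the larger capacity still work.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Widened(Fe2);

impl Widened {
    fn lazy_add(self, rhs: Widened) -> Widened {
        let (a, b) = (self.0, rhs.0);
        if a.used + b.used > REAL_CAPACITY {
            // Early reduction: invisible to the caller, who believes
            // more capacity remains.
            Widened(a.reduce().lazy_add(b.reduce()))
        } else {
            Widened(a.lazy_add(b))
        }
    }
    fn reduce(self) -> Widened { Widened(self.0.reduce()) }
}
```

Done at the type level, a generic function could declare the capacity it wants (say WithCapacity<3>) and accept any field, with smaller-capacity fields paying for the extra reductions transparently.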