openssl Improve FIPS RSA keygen performance

FIPS 186-4 has 5 different algorithms for key generation, and all of them rely on testing GCD(a,n) == 1 many times.

Cachegrind was showing that the function BN_gcd() was taking a considerable percentage of the total cycles, during a rsa keygen operation.

The default provider uses multiprime keygen, which seemed to be much faster. This is because it uses BN_mod_inverse() instead.

For a 4096 bit key, the entropy of a key that was taking a long time to generate was recorded and fed back into subsequent runs. Roughly 40% of the cycle time was BN_gcd() with most of the remainder in the prime testing. Changing to use the inverse resulted in the cyle count being 96% in the prime testing.

Checklist

[x] documentation is added or updated
[x] tests are added or updated

Nov 02 '22 02:11 slontis

inv gcd

Nov 02 '22 02:11 slontis

My primary concern is that BN_mod_inverse() isn't strictly constant time:

It copies the input BNs, dropping any constant time flags
- It adds constant times flags for the modulo and division operations
- It doesn't for the multiplication and addition
It doing Euclid's algorithm by changing pointers rather than constant time swaps or copies
- This was flagged as badness in #19309
- However, all values are accessed each iteration so exploiting the cache would seem to be a difficult side channel

These points can be addressed easily enough but will cost some performance.

Constant time multiply and add will be fast compared to division and modulo
Constant time BN copy will be much faster than the BN swap in #19309 because it uses memcpy(2) instead of constant time select in a loop

Nov 02 '22 08:11 paulidale

I would like to see even a theoretical attack on this BN_mod_inverse use in the RSA keygen. How would it work given the key is generated once so there is no real chance for the attacker to do repeated measurement to gather some statistics. But yeah, if making BN_mod_inverse consttime would not cost much, it would make sense.

Nov 02 '22 09:11 t8m

If this attack were only theoretical, why did we go to such lengths to be constant time in the BN mod function for keygen? I'm not saying it's exploitable, just that we need to be cautious.

Nov 02 '22 10:11 paulidale

I would like to see even a theoretical attack on this BN_mod_inverse use in the RSA keygen. How would it work given the key is generated once so there is no real chance for the attacker to do repeated measurement to gather some statistics. But yeah, if making BN_mod_inverse consttime would not cost much, it would make sense.

@t8m These seem to be very closely related:

Section 3.3 of https://urn.fi/URN:ISBN:978-952-03-2289-2 (for a brief recap)
https://doi.org/10.13154/tches.v2019.i4.213-242
https://doi.acm.org/10.1145/3196494.3196524

Nov 02 '22 11:11 romen

If the attacker can probe the keygen process often enough (either through power side channel or through microarchitecutral means), they may be able to deduce exact code path taken

That being said, this:

            /* (Step 1) GCD(2r1, r2) = 1 */
            && gcd_isone(tmp, r1x2, r2, ctx)
            /* (Step 2) R = ((r2^-1 mod 2r1) * r2) - ((2r1^-1 mod r2)*2r1) */
            && BN_mod_inverse(R, r2, r1x2, ctx)

is calling BN_mod_inverse twice over the same arguments, so if inverse is leaky in an exploitable way, it's a problem already, and this change doesn't make the situation worse.

The second change is much harder to say if it makes the code more vulnerable or not.

Nov 02 '22 11:11 tomato42

Do you think it would be useful to have a public symbol BN_gcd_isone()?

Nov 02 '22 17:11 kroeckx

Not sure if anyone noticed the comment related to the default provider. i.e the multi prime keygen already uses the inverse instead of BN_gcd() - so this has been used like this for quite a long time.

Nov 03 '22 02:11 slontis

Do you think it would be useful to have a public symbol BN_gcd_isone()?

I think we should expose it as a BN_are_coprime() call.

Nov 06 '22 05:11 paulidale

What am I supposed to do with the tmp inv param? I was trying to avoid allocating it internally? (add it as an optional last param?)

Nov 07 '22 07:11 slontis

BN_mod_inverse is allocating a pile of BIGNUMs internal, I doubt one more will hurt any.

Nov 07 '22 21:11 paulidale

Needs rebase or putting the libcrypto.num entry at some more clever place :grin:

Nov 09 '22 07:11 t8m

This pull request is ready to merge

Nov 17 '22 16:11 openssl-machine

@paulidale please reconfirm

Nov 18 '22 13:11 t8m

This pull request is ready to merge

Nov 19 '22 22:11 openssl-machine

Merged to master and 3.1 branches (after fixing the version in libcrypto.num entry). Thank you.

Nov 21 '22 10:11 t8m

Improve FIPS RSA keygen performance - change BN_gcd() test

Checklist