[3.6] Reduce performance regression in RSA public operations
Description
Fix #9232 - partially, the low-hanging fruits (also those that don't make the code size go up by too much).
Status: work in progress.
TODO:
- [ ] measure code size
- [ ] measure perf on other platforms
- [ ] currently the window size is based on the limbsize of E, should it use the actual bitsize instead when E is public?
- [ ] decide on API
- [ ] propagate to RSA functions
PR checklist
- [ ] changelog provided, or not required
- [ ] 3.6 backport this is it, but needs forward-porting
- [x] 2.28 backport not required
- [ ] tests provided, or not required
Initial performance measurements on my laptop (64-bit Intel):
3.6.0
1024/1024-bit exp_mod, cold r: 0.961081 ms
1024/1024-bit exp_mod, hot r: 1.050500 ms
17/1024-bit exp_mod, cold r: 0.100309 ms
17/1024-bit exp_mod, hot r: 0.177438 ms
This PR
1024/1024-bit exp_mod, cold r: 1.070629 ms
1024/1024-bit exp_mod, hot r: 1.062271 ms
17/1024-bit exp_mod, cold r: 0.034083 ms
17/1024-bit exp_mod, hot r: 0.027739 ms
3.5.1
1024/1024-bit exp_mod, cold r: 1.071714 ms
1024/1024-bit exp_mod, hot r: 1.083624 ms
17/1024-bit exp_mod, cold r: 0.022744 ms
17/1024-bit exp_mod, hot r: 0.016366 ms
Ok (with a few minor remarks) if we really have to have mixed-leakiness functions, but do we really have to do that?
No, I don't think we absolutely have to do that, and that's one of the main points about which I was hoping to get feedback. I just has to make a temporary choice for prototyping.
One opinion about the API: maybe it would be worth to keep the
_optionally_safe()function static and have an_unsafe()version in the header instead.
Yes, I think that would already be better.
I don't like the complexity of having functions whose security properties depend on a runtime argument.
Agreed. Do you think it's acceptable for static functions though or would you rather avoid it entirely?
The only argument I can think of to merge the two is code size. Is the gain really worth it?
That, and also there can be a maintenance cost if we end up duplicating source code. I can give it a try to do some size measurements and check how easy or awkward it is to avoid duplication.
I don't like the complexity of having functions whose security properties depend on a runtime argument.
Agreed. Do you think it's acceptable for static functions though or would you rather avoid it entirely?
I'd prefer to avoid it, but for smaller functions with a limited scope, it's not so bad.
With this PR (and the necessary changes to make use of the E_public parameter), the OP-TEE test case runs faster although still much slower than with v3.5.2. This is because the test doesn't only run RSA signature verification and decryption, it also includes RSA signing and encryption. Assuming OP-TEE switches to the more secure algorithm for secret stuff, we may need to disable this test in the CI and run it only during the release tests.
Here are the numbers (running time xtest 4011 on the arm32 QEMU target):
- v3.5.2 (optee_os master @ b339ffbd9): 14.5s
- v3.6.0: 8m 26.6s
- This PR (my branch: test-mbedtls-expmod-regression-fix-9281 @ 9dca1658f): 5m 58s
Just checking: @gilles-peskine-arm @yanesca will you be volunteering to review this once I've updated it, or should I look for another reviewer? (I'd appreciate if at least one of you could give at least a design review.)
I intend to review if I'm around (I'll take about a week off around 15 August).
I am happy to review it as well, but I will be taking some days off too (this Friday and next Wednesday).
There is still work needed, just pushing what I have so far to get an early CI overnight.
If we're going to have a single function for both cases, I would prefer a more specific naming convention than
optionally_safe. I don't like the use of “safe”, which could refer to many different things. I'd prefer leaky or “public”.
How about using ct instead of safe?
We could also put the emphasis in the other direction: can_skip_ct.
For the record, here's what I used for the above benchmarking results:
#include "mbedtls/bignum.h"
#include "bignum_core.h"
#include <stdio.h>
#include <time.h>
#define TIMEIT(TITLE, CODE) \
do { \
for (size_t i = 0; i < 1000; i++) { \
CODE; \
} \
clock_t start = clock(); \
for (size_t i = 0; i < 1000; i++) { \
CODE; \
} \
clock_t end = clock(); \
double duration = (double)(end - start) / CLOCKS_PER_SEC; \
printf("%s: %f ms\n", TITLE, duration); \
} while (0)
// hex(random.randrange(2 ** 1024))
const char *A = "49dd4a46f532d1e3c395875d0f43747367e1fe078ac571a541e17c1e1a505cbb9df94e8fbe4b6acee05134a3ccb8780cbf19abded6dee43af23c547046ab309090eccfd8d837ce33b7489421c9a93201b7598f32bf4d97de5302604d581e31e1d1f786f826d00678c62222d0ed97910a786a9ca2a24cd768ee4241e761d71494";
const char *D = "8e51ee5226332eea3e694e62919fe8a3e2c6be653cd63d5c87027848636293dfe42025ab64df0f75cacab381400e4b734b08b4cbb1c4a28f699ae81dab87838f501f090e72e5d546f8de82f273416c204fa66176c0b48aea5df5cb081b80b56e63e3bfa545157b259baeebcdbf50b7299953440a681f414ea4cfca9d8f2c0b91";
const char *E = "10001";
const char *N = "ff3f5df10b8db6aab53eb55c3c979ce9250287fb48ac031b650ecbe35a42b2f0d2edfdb2252023e0b5769574ba0d6e7a5dec6989b75c82bc0364a0f24c7e3acdd12720573b8bdbd879d65cf3d8fb7ae9324774bb910aac64dc9f233ff58472809ec260089ef66b304368e86b1d159ce330867c3c49758757305f3de52bf958a7";
int main(void) {
mbedtls_mpi x, a, d, e, n, r;
mbedtls_mpi_init(&x); mbedtls_mpi_init(&a); mbedtls_mpi_init(&e);
mbedtls_mpi_init(&d); mbedtls_mpi_init(&n); mbedtls_mpi_init(&r);
mbedtls_mpi_read_string(&a, 16, A);
mbedtls_mpi_read_string(&d, 16, D);
mbedtls_mpi_read_string(&e, 16, E);
mbedtls_mpi_read_string(&n, 16, N);
TIMEIT("1024/1024-bit exp_mod, cold r", mbedtls_mpi_free(&r); mbedtls_mpi_exp_mod(&x, &a, &d, &n, &r));
TIMEIT("1024/1024-bit exp_mod, hot r", mbedtls_mpi_exp_mod(&x, &a, &d, &n, &r));
TIMEIT(" 17/1024-bit exp_mod, cold r", mbedtls_mpi_free(&r); mbedtls_mpi_exp_mod_optionally_safe(&x, &a, &e, &n, &r, MBEDTLS_MPI_IS_PUBLIC));
TIMEIT(" 17/1024-bit exp_mod, hot r", mbedtls_mpi_exp_mod_optionally_safe(&x, &a, &e, &n, &r, MBEDTLS_MPI_IS_PUBLIC));
}
How about using ct instead of safe?
We could also put the emphasis in the other direction: can_skip_ct.
I chose the prefix following the existing mbedtls_mpi_core_get_mont_r2_unsafe() funtion. It didn't occur to me to tie it to the _ct prefix. Conceptually this seems neat and leaky is more precise, but if I think about it I probably like _unsafe better:
- In the context of
bignum_corewe don't always use_ctbut even where we do will go away over time as being constant time will become default. - I think it is more likely that people will consult the documentation and start to think if they read
optionally_unsafeas opposed tocan_skip_ctoroptionally_leaky
What do you think?
How about using ct instead of safe? We could also put the emphasis in the other direction: can_skip_ct.
I chose the prefix following the existing
mbedtls_mpi_core_get_mont_r2_unsafe()funtion. It didn't occur to me to tie it to the _ct prefix. Conceptually this seems neat andleakyis more precise, but if I think about it I probably like_unsafebetter:
- In the context of
bignum_corewe don't always use_ctbut even where we do will go away over time as being constant time will become default.- I think it is more likely that people will consult the documentation and start to think if they read
optionally_unsafeas opposed tocan_skip_ctoroptionally_leakyWhat do you think?
Yes, generally I prefer _unsafe, since it screams "here be dragons! read the docs!" - and this is one place where I'd rather slow down someone's reading and comprehension for security. I gave the other suggestions to offer alternatives.
That said, I find the suffix _optionally_unsafe very weird. I wonder about something like _may_be_unsafe but that's up to you. I realise "optionally" more clearly captures that it's the caller who chooses, but again, maybe this is a place where we want to make people think twice when reading/using these functions.
I think it is more likely that people will consult the documentation and start to think if they read optionally_unsafe as opposed to can_skip_ct or optionally_leaky
I doubt that. In languages like Haskell and Rust, “unsafe” has an aura of “what you need to get work done” and I think people just use it without trying to understand what subtle rule they're violating. Like casts in C. At least “leaky” tells you what sort of rule you're violating. I can't parse “can skip ct” (well, I can because I know what it's supposed to mean, but my first impression is that it can skip something implicit, and also it is constant-time).
I suggest we stick with the current naming for now. What Gilles said applies here as well:
Since it's not in the public API, we can look for a better way after 3.6.1.
The PR is ready for review.
The CI doesn't look happy.
The CI doesn't look happy.
There are (at least) two kind of issues:
- In
component_test_mx32at lot (326) of the generated test cases formbedtls_mpi_core_exp_mod()are failing, all on the same assertion:suites/test_suite_bignum_core.function:1277:0 == memcmp(X, A, N_limbs * sizeof(mbedtls_mpi_uint))- so presumably no computed the correct result. - Several components are failing to build with
multiple definition of `mbedtls_mpi_optionally_safe_codepath'. I guess we want to declare itextern intin the.hfile and only define it once in the.cfile.
The CI doesn't look happy.
Oh, sorry, I forgot to check back for the results.
Regarding testing with test hooks, I assume the reason is that we don't have valgrind/memsan testing of constant-time behaviour of mbedtls_mpi_core_exp_mod() in 3.6?
I didn't realise that we don't have the CT tests in 3.6 and my rationale was adding an extra safeguard, meant to assure that the two codepaths are kept separate. I mean the goal would be that there are well documented and marked sections in the code for the two codepaths even if temporarily or accidentally for a set of inputs the _unsafe() path happens to be constant time too. I admit that these tests doesn't enforce that, just was thought that it is better than nothing.
I believe it landed in development in the meantime, so I assume the development version of this PR wouldn't have to do that and would just rely on valgrind/memsan testing for mbedtls_mpi_core_exp_mod()?
In principle we could drop them in development if they are no objections. There is a complication though: it is still in restricted so we have a several options:
- use some ad-hoc process to publish these tests now
- after 3.6.1 is released we merge the changes from restricted to the public development branch and hold off the forward port until then
- we merge the forward port to restricted now and consider publishing the changes from restricted after 3.6.1 is out to remove the hooks
- merge the forward port with the test hooks for now and remove them when the CT tests have landed on development some other way
we don't seem to be doing the hooks-based testing on mbedtls_mpi_exp_mod() / mbedtls_mpi_exp_mod_unsafe()
I thought that the CT tests were already present and the test hooks felt a bit like an overkill to begin with and didn't want to waste CI cycles on it. But now that we know that we don't have the CT tests, it is probably better to add them.
which could be left for a follow-up however
You mean for after 3.6.1?
For code size, I took the difference between mbedtls-3.6 @ 28cdd1190 and this PR rebased onto that commit (since that's what will happen when this is merged).
In the default config, building for AArch64 (in a VM on Apple Silicon), the code size increase with clang-15 is
| base:mbedtls-3.6:28cdd1190 | pr:9281:rebased:dd9609174 | Delta | Filename |
|---|---|---|---|
| 22858, 0 | 22982, 0 | +124+0 | bignum.o |
| 9104, 0 | 9440, 0 | +336+0 | bignum_core.o |
| +460+0 | TOTAL |
(Full config gives the same change)
Building for TF-M gives no change, since that config uses p256-m and don't use bignum*.
Building for Thumb2 with minimum changes from default to get it to build (+MBEDTLS_NO_PLATFORM_ENTROPY-MBEDTLS_FS_IO-MBEDTLS_HAVE_TIME-MBEDTLS_HAVE_TIME_DATE-MBEDTLS_NET_C-MBEDTLS_PSA_CRYPTO_STORAGE_C-MBEDTLS_PSA_ITS_FILE_C-MBEDTLS_TIMING_C) and armclang-6.19
| base:mbedtls-3.6:28cdd1190 | pr:9281:rebased:dd9609174 | Delta | Filename |
|---|---|---|---|
| 8866, 0 | 8944, 0 | +78+0 | bignum.o |
| 2578, 0 | 2730, 0 | +152+0 | bignum_core.o |
| +230+0 | TOTAL |
For x86_64 on my Linux machine (separate rebase, but same outcome) with clang-10 (as that's what's installed by default there)
| base:mbedtls-3.6:28cdd1190 | pr:9281:rebased:23675570a | Delta | Filename |
|---|---|---|---|
| 28556, 0 | 28710, 0 | +154+0 | bignum.o |
| 10883, 0 | 11139, 0 | +256+0 | bignum_core.o |
| +410+0 | TOTAL |
You mean for after 3.6.1?
I mean if I have to choose between merging this PR without the extra tests or have it miss the code freeze deadline, I prefer to merge it without the extra tests. But if merging it with the extra tests is also on the table, that's of course my favourite option :)
I raised a separate PR for the additional testing: https://github.com/Mbed-TLS/mbedtls/pull/9493
Performance measurements, in default config (desktop only, sorry)
aarch64:
| Test | base:mbedtls-3.6:28cdd1190 | pr:9281:rebased:dd9609174 | Relative |
|---|---|---|---|
| 1024/1024-bit exp_mod, cold r | 0.449740 ms | 0.445420 ms | 1.01x |
| 1024/1024-bit exp_mod, hot r | 0.444260 ms | 0.442480 ms | 1.00x |
| 17/1024-bit exp_mod, cold r | 0.044070 ms | 0.014440 ms | 3.05x |
| 17/1024-bit exp_mod, hot r | 0.041390 ms | 0.011440 ms | 3.62x |
x86_64:
| Test | base:mbedtls-3.6:28cdd1190 | pr:9281:rebased:23675570a | Relative |
|---|---|---|---|
| 1024/1024-bit exp_mod, cold r | 0.672410 ms | 0.653440 ms | 1.03x |
| 1024/1024-bit exp_mod, hot r | 0.670990 ms | 0.649560 ms | 1.03x |
| 17/1024-bit exp_mod, cold r | 0.066270 ms | 0.020480 ms | 3.24x |
| 17/1024-bit exp_mod, hot r | 0.062450 ms | 0.016760 ms | 3.73x |
I raised a separate PR for the additional testing: #9493
It looks like you also pushed the additional tests here, which I guess wasn't intended since you created a separate PR for them, but is absolutely fine by me. Now that they're here, let's review them and hopefully get the whole thing merged - which as I was saying was my favourite outcome anyway :)
It looks like you also pushed the additional tests here, which I guess wasn't intended since you created a separate PR for them, but is absolutely fine by me.
Oh, I am sorry, that was an accident, I probably created the new branch too late and didn't notice.
At least one of the test failures is relevant, e.g. tls-testing / all_u16-build_arm_linux_gnueabi_gcc_arm5vte
[2024-08-22T07:44:40.471Z] CC test_suite_rsa.c
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function: In function 'test_mbedtls_rsa_public':
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:493:5: error: implicit declaration of function 'mbedtls_mpi_optionally_safe_codepath_reset' [-Werror=implicit-function-declaration]
[2024-08-22T07:44:40.729Z] mbedtls_mpi_optionally_safe_codepath_reset();
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] In file included from suites/helpers.function:7:0:
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:497:16: error: 'mbedtls_mpi_optionally_safe_codepath' undeclared (first use in this function)
[2024-08-22T07:44:40.729Z] TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_PUBLIC);
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:56: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z] (unsigned long long) (expr1), (unsigned long long) (expr2))) \
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:497:16: note: each undeclared identifier is reported only once for each function it appears in
[2024-08-22T07:44:40.729Z] TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_PUBLIC);
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:56: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z] (unsigned long long) (expr1), (unsigned long long) (expr2))) \
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:497:54: error: 'MBEDTLS_MPI_IS_PUBLIC' undeclared (first use in this function)
[2024-08-22T07:44:40.729Z] TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_PUBLIC);
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:86: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z] (unsigned long long) (expr1), (unsigned long long) (expr2))) \
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function: In function 'test_mbedtls_rsa_private':
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:570:20: error: 'mbedtls_mpi_optionally_safe_codepath' undeclared (first use in this function)
[2024-08-22T07:44:40.729Z] TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_SECRET);
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:56: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z] (unsigned long long) (expr1), (unsigned long long) (expr2))) \
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:570:58: error: 'MBEDTLS_MPI_IS_SECRET' undeclared (first use in this function)
[2024-08-22T07:44:40.729Z] TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_SECRET);
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:86: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z] (unsigned long long) (expr1), (unsigned long long) (expr2))) \
[2024-08-22T07:44:40.729Z] ^
[2024-08-22T07:44:40.729Z] cc1: all warnings being treated as errors
Thanks - this is looking good to me, but I have two remaining comments:
-
I think we need a ChangeLog entry - this is very definitely user-visible, even if it's an NFR;
-
While there's already a big comment around the definition of
MBEDTLS_MPI_IS_PUBLIC, I think we should say why we have such a "weird" value. I know that is documented above in this PR, but that is really something that should be in the code "Use a value that is unlikely to happen by accident, but which can be used as an immediate value in a Thumb2 comparison (for code size)"
The Windows CI failures are test failures:
[2024-08-22T09:08:34.230Z] Base test mbedtls_mpi_exp_mod #1 .................................. PASS
[2024-08-22T09:08:34.230Z] Base test mbedtls_mpi_exp_mod #2 (Even N) ......................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Base test mbedtls_mpi_exp_mod #2 (N = 0 (null)) ................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Base test mbedtls_mpi_exp_mod #3 (Negative N) ..................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Base test mbedtls_mpi_exp_mod #4 (Negative base) .................. PASS
[2024-08-22T09:08:34.230Z] Base test mbedtls_mpi_exp_mod #5 (Negative exponent) .............. FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Base test mbedtls_mpi_exp_mod #6 (Negative base + exponent) ....... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 0 (null) ^ 0 (null) mod 9 ............... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 0 (null) ^ 0 (1 limb) mod 9 ............. PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 0 (null) ^ 1 mod 9 ...................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 0 (null) ^ 2 mod 9 ...................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 0 (1 limb) ^ 0 (null) mod 9 ............. FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 0 (1 limb) ^ 0 (1 limb) mod 9 ........... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 0 (1 limb) ^ 1 mod 9 .................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 0 (1 limb) ^ 2 mod 9 .................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 1 ^ 0 (null) mod 9 ...................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 4 ^ 0 (null) mod 9 ...................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 10 ^ 0 (null) mod 9 ..................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 1 ^ 0 (1 limb) mod 9 .................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 4 ^ 0 (1 limb) mod 9 .................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: 10 ^ 0 (1 limb) mod 9 ................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: -3 ^ 3 mod 27 ........................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: MAX_SIZE exponent ....................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1128, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x000000002a2a2a2a = 707406378
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: MAX_SIZE + 1 exponent ................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1120, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: MAX_SIZE modulus ........................ FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1128, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x000000002a2a2a2a = 707406378
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: MAX_SIZE + 1 modulus .................... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1120, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: MAX_SIZE exponent and modulus ........... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1128, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x000000002a2a2a2a = 707406378
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod: MAX_SIZE + 1 exponent and modulus ....... FAILED
[2024-08-22T09:08:34.230Z] mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z] at line 1120, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z] lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z] rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod #1 ....................................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod (Negative base) [#1] ..................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod (Negative base) [#2] ..................... PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod (N.n=3, RR.n=1 on 32 bit) ................ PASS
[2024-08-22T09:08:34.230Z] Test mbedtls_mpi_exp_mod (N.n=3, RR.n=1 on 64 bit) ................ ----
[2024-08-22T09:08:34.230Z] Unmet dependencies: 0
Here are measurements I did on 32-bit Arm (Cortex-A7, Pi 2):
| 3.5.1 | 3.6.0 | 9281 | |||
|---|---|---|---|---|---|
| 1024/1024-bit exp_mod, cold r | 26.73 ms | 25.84 ms | x 0.96 | 25.47 ms | x 0.95 |
| 1024/1024-bit exp_mod, hot r | 26.52 ms | 25.40 ms | x 0.96 | 25.25 ms | x 0.95 |
| 17/1024-bit exp_mod, cold r | 0.62 ms | 1.43 ms | x 2.31 | 0.87 ms | x 1.56 |
| 17/1024-bit exp_mod, hot r | 0.42 ms | 1.22 ms | x 2.90 | 0.66 ms | x 1.57 |
| '17'/1024-bit exp_mod, cold r | 0.64 ms | 25.61 ms | x 40 | 0.87 ms | x 1.56 |
| '17'/1024-bit exp_mod, hot r | 0.43 ms | 25.40 ms | x 60 | 0.66 ms | x 1.57 |
Using a slightly modified version of the program above where the '17'/1024 lines are the same as 17/1024 except with mbedtls_mpi_grow(&e, d.private_n) inserted in-between.
All ratios in the above table are with 3.5.1 as the reference value. So, basically public operations after this PR would be about 50-60% slower than they were in 3.5.1, which I think is quite acceptable - in any case a vast improvement over the situation in 3.6.0.