mbedtls [3.6] Reduce performance regression in RSA public operations

Description

Fix #9232 - partially, the low-hanging fruits (also those that don't make the code size go up by too much).

Status: work in progress.

TODO:

[ ] measure code size
[ ] measure perf on other platforms
[ ] currently the window size is based on the limbsize of E, should it use the actual bitsize instead when E is public?
[ ] decide on API
[ ] propagate to RSA functions

PR checklist

[ ] changelog provided, or not required
[ ] 3.6 backport this is it, but needs forward-porting
[x] 2.28 backport not required
[ ] tests provided, or not required

Jun 18 '24 11:06 mpg

Initial performance measurements on my laptop (64-bit Intel):

3.6.0

1024/1024-bit exp_mod, cold r: 0.961081 ms
1024/1024-bit exp_mod,  hot r: 1.050500 ms
  17/1024-bit exp_mod, cold r: 0.100309 ms
  17/1024-bit exp_mod,  hot r: 0.177438 ms

This PR

1024/1024-bit exp_mod, cold r: 1.070629 ms
1024/1024-bit exp_mod,  hot r: 1.062271 ms
  17/1024-bit exp_mod, cold r: 0.034083 ms
  17/1024-bit exp_mod,  hot r: 0.027739 ms

3.5.1

1024/1024-bit exp_mod, cold r: 1.071714 ms
1024/1024-bit exp_mod,  hot r: 1.083624 ms
  17/1024-bit exp_mod, cold r: 0.022744 ms
  17/1024-bit exp_mod,  hot r: 0.016366 ms

Edit: code used for benchmarking

Jun 18 '24 11:06 mpg

Ok (with a few minor remarks) if we really have to have mixed-leakiness functions, but do we really have to do that?

No, I don't think we absolutely have to do that, and that's one of the main points about which I was hoping to get feedback. I just has to make a temporary choice for prototyping.

One opinion about the API: maybe it would be worth to keep the _optionally_safe() function static and have an _unsafe() version in the header instead.

Yes, I think that would already be better.

I don't like the complexity of having functions whose security properties depend on a runtime argument.

Agreed. Do you think it's acceptable for static functions though or would you rather avoid it entirely?

The only argument I can think of to merge the two is code size. Is the gain really worth it?

That, and also there can be a maintenance cost if we end up duplicating source code. I can give it a try to do some size measurements and check how easy or awkward it is to avoid duplication.

Jul 22 '24 11:07 mpg

I don't like the complexity of having functions whose security properties depend on a runtime argument.

Agreed. Do you think it's acceptable for static functions though or would you rather avoid it entirely?

I'd prefer to avoid it, but for smaller functions with a limited scope, it's not so bad.

Jul 22 '24 11:07 gilles-peskine-arm

With this PR (and the necessary changes to make use of the E_public parameter), the OP-TEE test case runs faster although still much slower than with v3.5.2. This is because the test doesn't only run RSA signature verification and decryption, it also includes RSA signing and encryption. Assuming OP-TEE switches to the more secure algorithm for secret stuff, we may need to disable this test in the CI and run it only during the release tests. Here are the numbers (running time xtest 4011 on the arm32 QEMU target):

v3.5.2 (optee_os master @ b339ffbd9): 14.5s
v3.6.0: 8m 26.6s
This PR (my branch: test-mbedtls-expmod-regression-fix-9281 @ 9dca1658f): 5m 58s

Jul 23 '24 13:07 jforissier

Just checking: @gilles-peskine-arm @yanesca will you be volunteering to review this once I've updated it, or should I look for another reviewer? (I'd appreciate if at least one of you could give at least a design review.)

Aug 06 '24 10:08 mpg

I intend to review if I'm around (I'll take about a week off around 15 August).

Aug 06 '24 11:08 gilles-peskine-arm

I am happy to review it as well, but I will be taking some days off too (this Friday and next Wednesday).

Aug 06 '24 11:08 yanesca

There is still work needed, just pushing what I have so far to get an early CI overnight.

Aug 12 '24 19:08 yanesca

If we're going to have a single function for both cases, I would prefer a more specific naming convention than optionally_safe. I don't like the use of “safe”, which could refer to many different things. I'd prefer leaky or “public”.

How about using ct instead of safe?

We could also put the emphasis in the other direction: can_skip_ct.

Aug 14 '24 06:08 tom-cosgrove-arm

For the record, here's what I used for the above benchmarking results:

#include "mbedtls/bignum.h"
#include "bignum_core.h"
#include <stdio.h>
#include <time.h>

#define TIMEIT(TITLE, CODE)                                         \
do {                                                                \
    for (size_t i = 0; i < 1000; i++) {                             \
        CODE;                                                       \
    }                                                               \
    clock_t start = clock();                                        \
    for (size_t i = 0; i < 1000; i++) {                             \
        CODE;                                                       \
    }                                                               \
    clock_t end = clock();                                          \
    double duration = (double)(end - start) / CLOCKS_PER_SEC;       \
    printf("%s: %f ms\n", TITLE, duration);                         \
} while (0)


// hex(random.randrange(2 ** 1024))
const char *A = "49dd4a46f532d1e3c395875d0f43747367e1fe078ac571a541e17c1e1a505cbb9df94e8fbe4b6acee05134a3ccb8780cbf19abded6dee43af23c547046ab309090eccfd8d837ce33b7489421c9a93201b7598f32bf4d97de5302604d581e31e1d1f786f826d00678c62222d0ed97910a786a9ca2a24cd768ee4241e761d71494";
const char *D = "8e51ee5226332eea3e694e62919fe8a3e2c6be653cd63d5c87027848636293dfe42025ab64df0f75cacab381400e4b734b08b4cbb1c4a28f699ae81dab87838f501f090e72e5d546f8de82f273416c204fa66176c0b48aea5df5cb081b80b56e63e3bfa545157b259baeebcdbf50b7299953440a681f414ea4cfca9d8f2c0b91";
const char *E = "10001";
const char *N = "ff3f5df10b8db6aab53eb55c3c979ce9250287fb48ac031b650ecbe35a42b2f0d2edfdb2252023e0b5769574ba0d6e7a5dec6989b75c82bc0364a0f24c7e3acdd12720573b8bdbd879d65cf3d8fb7ae9324774bb910aac64dc9f233ff58472809ec260089ef66b304368e86b1d159ce330867c3c49758757305f3de52bf958a7";

int main(void) {
    mbedtls_mpi x, a, d, e, n, r;

    mbedtls_mpi_init(&x); mbedtls_mpi_init(&a); mbedtls_mpi_init(&e);
    mbedtls_mpi_init(&d); mbedtls_mpi_init(&n); mbedtls_mpi_init(&r);

    mbedtls_mpi_read_string(&a, 16, A);
    mbedtls_mpi_read_string(&d, 16, D);
    mbedtls_mpi_read_string(&e, 16, E);
    mbedtls_mpi_read_string(&n, 16, N);

    TIMEIT("1024/1024-bit exp_mod, cold r", mbedtls_mpi_free(&r); mbedtls_mpi_exp_mod(&x, &a, &d, &n, &r));
    TIMEIT("1024/1024-bit exp_mod,  hot r", mbedtls_mpi_exp_mod(&x, &a, &d, &n, &r));
    TIMEIT("  17/1024-bit exp_mod, cold r", mbedtls_mpi_free(&r); mbedtls_mpi_exp_mod_optionally_safe(&x, &a, &e, &n, &r, MBEDTLS_MPI_IS_PUBLIC));
    TIMEIT("  17/1024-bit exp_mod,  hot r", mbedtls_mpi_exp_mod_optionally_safe(&x, &a, &e, &n, &r, MBEDTLS_MPI_IS_PUBLIC));
}

Aug 14 '24 09:08 mpg

How about using ct instead of safe?

We could also put the emphasis in the other direction: can_skip_ct.

I chose the prefix following the existing mbedtls_mpi_core_get_mont_r2_unsafe() funtion. It didn't occur to me to tie it to the _ct prefix. Conceptually this seems neat and leaky is more precise, but if I think about it I probably like _unsafe better:

In the context of bignum_core we don't always use _ct but even where we do will go away over time as being constant time will become default.
I think it is more likely that people will consult the documentation and start to think if they read optionally_unsafe as opposed to can_skip_ct or optionally_leaky

What do you think?

Aug 15 '24 13:08 yanesca

How about using ct instead of safe? We could also put the emphasis in the other direction: can_skip_ct.

I chose the prefix following the existing mbedtls_mpi_core_get_mont_r2_unsafe() funtion. It didn't occur to me to tie it to the _ct prefix. Conceptually this seems neat and leaky is more precise, but if I think about it I probably like _unsafe better:

In the context of bignum_core we don't always use _ct but even where we do will go away over time as being constant time will become default.

I think it is more likely that people will consult the documentation and start to think if they read optionally_unsafe as opposed to can_skip_ct or optionally_leaky

What do you think?

Yes, generally I prefer _unsafe, since it screams "here be dragons! read the docs!" - and this is one place where I'd rather slow down someone's reading and comprehension for security. I gave the other suggestions to offer alternatives.

That said, I find the suffix _optionally_unsafe very weird. I wonder about something like _may_be_unsafe but that's up to you. I realise "optionally" more clearly captures that it's the caller who chooses, but again, maybe this is a place where we want to make people think twice when reading/using these functions.

Aug 15 '24 13:08 tom-cosgrove-arm

I think it is more likely that people will consult the documentation and start to think if they read optionally_unsafe as opposed to can_skip_ct or optionally_leaky

I doubt that. In languages like Haskell and Rust, “unsafe” has an aura of “what you need to get work done” and I think people just use it without trying to understand what subtle rule they're violating. Like casts in C. At least “leaky” tells you what sort of rule you're violating. I can't parse “can skip ct” (well, I can because I know what it's supposed to mean, but my first impression is that it can skip something implicit, and also it is constant-time).

Aug 15 '24 14:08 gilles-peskine-arm

I suggest we stick with the current naming for now. What Gilles said applies here as well:

Since it's not in the public API, we can look for a better way after 3.6.1.

Aug 15 '24 15:08 yanesca

The PR is ready for review.

Aug 15 '24 15:08 yanesca

The CI doesn't look happy.

Aug 20 '24 08:08 mpg

The CI doesn't look happy.

There are (at least) two kind of issues:

In component_test_mx32 at lot (326) of the generated test cases for mbedtls_mpi_core_exp_mod() are failing, all on the same assertion: suites/test_suite_bignum_core.function:1277:0 == memcmp(X, A, N_limbs * sizeof(mbedtls_mpi_uint)) - so presumably no computed the correct result.
Several components are failing to build with multiple definition of `mbedtls_mpi_optionally_safe_codepath'. I guess we want to declare it extern int in the .h file and only define it once in the .c file.

Aug 20 '24 08:08 mpg

The CI doesn't look happy.

Oh, sorry, I forgot to check back for the results.

Aug 20 '24 11:08 yanesca

Regarding testing with test hooks, I assume the reason is that we don't have valgrind/memsan testing of constant-time behaviour of mbedtls_mpi_core_exp_mod() in 3.6?

I didn't realise that we don't have the CT tests in 3.6 and my rationale was adding an extra safeguard, meant to assure that the two codepaths are kept separate. I mean the goal would be that there are well documented and marked sections in the code for the two codepaths even if temporarily or accidentally for a set of inputs the _unsafe() path happens to be constant time too. I admit that these tests doesn't enforce that, just was thought that it is better than nothing.

I believe it landed in development in the meantime, so I assume the development version of this PR wouldn't have to do that and would just rely on valgrind/memsan testing for mbedtls_mpi_core_exp_mod()?

In principle we could drop them in development if they are no objections. There is a complication though: it is still in restricted so we have a several options:

use some ad-hoc process to publish these tests now
after 3.6.1 is released we merge the changes from restricted to the public development branch and hold off the forward port until then
we merge the forward port to restricted now and consider publishing the changes from restricted after 3.6.1 is out to remove the hooks
merge the forward port with the test hooks for now and remove them when the CT tests have landed on development some other way

we don't seem to be doing the hooks-based testing on mbedtls_mpi_exp_mod() / mbedtls_mpi_exp_mod_unsafe()

I thought that the CT tests were already present and the test hooks felt a bit like an overkill to begin with and didn't want to waste CI cycles on it. But now that we know that we don't have the CT tests, it is probably better to add them.

Aug 20 '24 12:08 yanesca

which could be left for a follow-up however

You mean for after 3.6.1?

Aug 20 '24 13:08 yanesca

For code size, I took the difference between mbedtls-3.6 @ 28cdd1190 and this PR rebased onto that commit (since that's what will happen when this is merged).

In the default config, building for AArch64 (in a VM on Apple Silicon), the code size increase with clang-15 is

base:mbedtls-3.6:28cdd1190	pr:9281:rebased:dd9609174	Delta	Filename
22858, 0	22982, 0	+124+0	`bignum.o`
9104, 0	9440, 0	+336+0	`bignum_core.o`
		+460+0	TOTAL

(Full config gives the same change)

Building for TF-M gives no change, since that config uses p256-m and don't use bignum*.

Building for Thumb2 with minimum changes from default to get it to build (+MBEDTLS_NO_PLATFORM_ENTROPY-MBEDTLS_FS_IO-MBEDTLS_HAVE_TIME-MBEDTLS_HAVE_TIME_DATE-MBEDTLS_NET_C-MBEDTLS_PSA_CRYPTO_STORAGE_C-MBEDTLS_PSA_ITS_FILE_C-MBEDTLS_TIMING_C) and armclang-6.19

base:mbedtls-3.6:28cdd1190	pr:9281:rebased:dd9609174	Delta	Filename
8866, 0	8944, 0	+78+0	`bignum.o`
2578, 0	2730, 0	+152+0	`bignum_core.o`
		+230+0	TOTAL

For x86_64 on my Linux machine (separate rebase, but same outcome) with clang-10 (as that's what's installed by default there)

base:mbedtls-3.6:28cdd1190	pr:9281:rebased:23675570a	Delta	Filename
28556, 0	28710, 0	+154+0	`bignum.o`
10883, 0	11139, 0	+256+0	`bignum_core.o`
		+410+0	TOTAL

Aug 21 '24 07:08 tom-cosgrove-arm

You mean for after 3.6.1?

I mean if I have to choose between merging this PR without the extra tests or have it miss the code freeze deadline, I prefer to merge it without the extra tests. But if merging it with the extra tests is also on the table, that's of course my favourite option :)

Aug 21 '24 10:08 mpg

I raised a separate PR for the additional testing: https://github.com/Mbed-TLS/mbedtls/pull/9493

Aug 21 '24 14:08 yanesca

Performance measurements, in default config (desktop only, sorry)

aarch64:

Test	base:mbedtls-3.6:28cdd1190	pr:9281:rebased:dd9609174	Relative
1024/1024-bit exp_mod, cold r	0.449740 ms	0.445420 ms	1.01x
1024/1024-bit exp_mod, hot r	0.444260 ms	0.442480 ms	1.00x
17/1024-bit exp_mod, cold r	0.044070 ms	0.014440 ms	3.05x
17/1024-bit exp_mod, hot r	0.041390 ms	0.011440 ms	3.62x

x86_64:

Test	base:mbedtls-3.6:28cdd1190	pr:9281:rebased:23675570a	Relative
1024/1024-bit exp_mod, cold r	0.672410 ms	0.653440 ms	1.03x
1024/1024-bit exp_mod, hot r	0.670990 ms	0.649560 ms	1.03x
17/1024-bit exp_mod, cold r	0.066270 ms	0.020480 ms	3.24x
17/1024-bit exp_mod, hot r	0.062450 ms	0.016760 ms	3.73x

Aug 22 '24 06:08 tom-cosgrove-arm

I raised a separate PR for the additional testing: #9493

It looks like you also pushed the additional tests here, which I guess wasn't intended since you created a separate PR for them, but is absolutely fine by me. Now that they're here, let's review them and hopefully get the whole thing merged - which as I was saying was my favourite outcome anyway :)

Aug 22 '24 07:08 mpg

It looks like you also pushed the additional tests here, which I guess wasn't intended since you created a separate PR for them, but is absolutely fine by me.

Oh, I am sorry, that was an accident, I probably created the new branch too late and didn't notice.

Aug 22 '24 07:08 yanesca

At least one of the test failures is relevant, e.g. tls-testing / all_u16-build_arm_linux_gnueabi_gcc_arm5vte

[2024-08-22T07:44:40.471Z]   CC    test_suite_rsa.c
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function: In function 'test_mbedtls_rsa_public':
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:493:5: error: implicit declaration of function 'mbedtls_mpi_optionally_safe_codepath_reset' [-Werror=implicit-function-declaration]
[2024-08-22T07:44:40.729Z]      mbedtls_mpi_optionally_safe_codepath_reset();
[2024-08-22T07:44:40.729Z]      ^
[2024-08-22T07:44:40.729Z] In file included from suites/helpers.function:7:0:
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:497:16: error: 'mbedtls_mpi_optionally_safe_codepath' undeclared (first use in this function)
[2024-08-22T07:44:40.729Z]      TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_PUBLIC);
[2024-08-22T07:44:40.729Z]                 ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:56: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z]                                   (unsigned long long) (expr1), (unsigned long long) (expr2)))                      \
[2024-08-22T07:44:40.729Z]                                                         ^
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:497:16: note: each undeclared identifier is reported only once for each function it appears in
[2024-08-22T07:44:40.729Z]      TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_PUBLIC);
[2024-08-22T07:44:40.729Z]                 ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:56: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z]                                   (unsigned long long) (expr1), (unsigned long long) (expr2)))                      \
[2024-08-22T07:44:40.729Z]                                                         ^
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:497:54: error: 'MBEDTLS_MPI_IS_PUBLIC' undeclared (first use in this function)
[2024-08-22T07:44:40.729Z]      TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_PUBLIC);
[2024-08-22T07:44:40.729Z]                                                       ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:86: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z]                                   (unsigned long long) (expr1), (unsigned long long) (expr2)))                      \
[2024-08-22T07:44:40.729Z]                                                                                       ^
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function: In function 'test_mbedtls_rsa_private':
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:570:20: error: 'mbedtls_mpi_optionally_safe_codepath' undeclared (first use in this function)
[2024-08-22T07:44:40.729Z]          TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_SECRET);
[2024-08-22T07:44:40.729Z]                     ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:56: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z]                                   (unsigned long long) (expr1), (unsigned long long) (expr2)))                      \
[2024-08-22T07:44:40.729Z]                                                         ^
[2024-08-22T07:44:40.729Z] suites/test_suite_rsa.function:570:58: error: 'MBEDTLS_MPI_IS_SECRET' undeclared (first use in this function)
[2024-08-22T07:44:40.729Z]          TEST_EQUAL(mbedtls_mpi_optionally_safe_codepath, MBEDTLS_MPI_IS_SECRET);
[2024-08-22T07:44:40.729Z]                                                           ^
[2024-08-22T07:44:40.729Z] ./include/test/macros.h:74:86: note: in definition of macro 'TEST_EQUAL'
[2024-08-22T07:44:40.729Z]                                   (unsigned long long) (expr1), (unsigned long long) (expr2)))                      \
[2024-08-22T07:44:40.729Z]                                                                                       ^
[2024-08-22T07:44:40.729Z] cc1: all warnings being treated as errors

Aug 22 '24 08:08 tom-cosgrove-arm

Thanks - this is looking good to me, but I have two remaining comments:

I think we need a ChangeLog entry - this is very definitely user-visible, even if it's an NFR;
While there's already a big comment around the definition of MBEDTLS_MPI_IS_PUBLIC, I think we should say why we have such a "weird" value. I know that is documented above in this PR, but that is really something that should be in the code "Use a value that is unlikely to happen by accident, but which can be used as an immediate value in a Thumb2 comparison (for code size)"

Aug 22 '24 08:08 tom-cosgrove-arm

The Windows CI failures are test failures:

[2024-08-22T09:08:34.230Z]          Base test mbedtls_mpi_exp_mod #1 .................................. PASS
[2024-08-22T09:08:34.230Z]          Base test mbedtls_mpi_exp_mod #2 (Even N) ......................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Base test mbedtls_mpi_exp_mod #2 (N = 0 (null)) ................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Base test mbedtls_mpi_exp_mod #3 (Negative N) ..................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Base test mbedtls_mpi_exp_mod #4 (Negative base) .................. PASS
[2024-08-22T09:08:34.230Z]          Base test mbedtls_mpi_exp_mod #5 (Negative exponent) .............. FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Base test mbedtls_mpi_exp_mod #6 (Negative base + exponent) ....... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 0 (null) ^ 0 (null) mod 9 ............... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 0 (null) ^ 0 (1 limb) mod 9 ............. PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 0 (null) ^ 1 mod 9 ...................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 0 (null) ^ 2 mod 9 ...................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 0 (1 limb) ^ 0 (null) mod 9 ............. FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 0 (1 limb) ^ 0 (1 limb) mod 9 ........... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 0 (1 limb) ^ 1 mod 9 .................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 0 (1 limb) ^ 2 mod 9 .................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 1 ^ 0 (null) mod 9 ...................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 4 ^ 0 (null) mod 9 ...................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 10 ^ 0 (null) mod 9 ..................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1033, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 1 ^ 0 (1 limb) mod 9 .................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 4 ^ 0 (1 limb) mod 9 .................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: 10 ^ 0 (1 limb) mod 9 ................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: -3 ^ 3 mod 27 ........................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: MAX_SIZE exponent ....................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1128, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x000000002a2a2a2a = 707406378
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: MAX_SIZE + 1 exponent ................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1120, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: MAX_SIZE modulus ........................ FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1128, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x000000002a2a2a2a = 707406378
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: MAX_SIZE + 1 modulus .................... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1120, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: MAX_SIZE exponent and modulus ........... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1128, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x000000002a2a2a2a = 707406378
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod: MAX_SIZE + 1 exponent and modulus ....... FAILED
[2024-08-22T09:08:34.230Z]            mbedtls_mpi_optionally_safe_codepath == MBEDTLS_MPI_IS_SECRET
[2024-08-22T09:08:34.230Z]            at line 1120, C:/Windows/workspace/mbed-tls-pr-head_PR-9281-head/worktrees/tmpiq_0xjib/tests/suites/test_suite_bignum.function
[2024-08-22T09:08:34.230Z]            lhs = 0x0000000000000001 = 1
[2024-08-22T09:08:34.230Z]            rhs = 0x0000000000000000 = 0
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod #1 ....................................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod (Negative base) [#1] ..................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod (Negative base) [#2] ..................... PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod (N.n=3, RR.n=1 on 32 bit) ................ PASS
[2024-08-22T09:08:34.230Z]          Test mbedtls_mpi_exp_mod (N.n=3, RR.n=1 on 64 bit) ................ ----
[2024-08-22T09:08:34.230Z]             Unmet dependencies: 0

Aug 22 '24 09:08 tom-cosgrove-arm

Here are measurements I did on 32-bit Arm (Cortex-A7, Pi 2):

	3.5.1	3.6.0		9281
1024/1024-bit exp_mod, cold r	26.73 ms	25.84 ms	x 0.96	25.47 ms	x 0.95
1024/1024-bit exp_mod, hot r	26.52 ms	25.40 ms	x 0.96	25.25 ms	x 0.95
17/1024-bit exp_mod, cold r	0.62 ms	1.43 ms	x 2.31	0.87 ms	x 1.56
17/1024-bit exp_mod, hot r	0.42 ms	1.22 ms	x 2.90	0.66 ms	x 1.57
'17'/1024-bit exp_mod, cold r	0.64 ms	25.61 ms	x 40	0.87 ms	x 1.56
'17'/1024-bit exp_mod, hot r	0.43 ms	25.40 ms	x 60	0.66 ms	x 1.57

Using a slightly modified version of the program above where the '17'/1024 lines are the same as 17/1024 except with mbedtls_mpi_grow(&e, d.private_n) inserted in-between.

All ratios in the above table are with 3.5.1 as the reference value. So, basically public operations after this PR would be about 50-60% slower than they were in 3.5.1, which I think is quite acceptable - in any case a vast improvement over the situation in 3.6.0.

Aug 22 '24 10:08 mpg