mbedtls icon indicating copy to clipboard operation
mbedtls copied to clipboard

myrand() in benchmark.c on big-endian causes rsa key generation to loop

Open jbsgh opened this issue 3 months ago • 11 comments

Summary

benchmark.c generates pseudo-random numbers by calling rand() in myrand(). These numbers are used to feed the RSA key-generator. mbedtls_mpi_gen_prime() checks that the most significant uint32_t is greater or equal than CEIL_MAXUINT_DIV_SQRT2. This requires at least a set MSB in X->p[n-1]. On big-endian machines this uint32 is effectively set by a call to rand() in myrand(). rand() produces pseudo-random int numbers in the range 0..RAND_MAX. Thus, the MSB is never set (rand() is int!) and the loop in mbedtls_mpi_gen_prime() cannot terminate when comparing the value as uint32_t.

myrand() in benchmark.c has to provide uint32_t random numbers over the full range instead of 0..RAND_MAX.

System information

Mbed TLS version: 3.6.4 Configuration: big endian, 32 Bit

Expected behavior

The benchmark test terminates after completing all tests.

Actual behavior

The benchmark test loops in RSA key generation and does not terminate at all.

Steps to reproduce

Start benchmark test on a machine with big endian.

Additional information

On little-endian machines the bignum X is converted to big endian before. The least significant byte of the rand()-call is shifted to the most significant position. The resulting value in the little-endian comparison may then fulfill the neccessary condition in mbedtls_mpi_gen_prime() to leave the loop.

jbsgh avatar Oct 01 '25 13:10 jbsgh

Thank you for your report and the detailed analysis! Indeed this looks like a silly bug in the benchmark program's myrand() function.

I think the following version of myrand() should work on all platforms:

static int myrand(void *rng_state, unsigned char *output, size_t len)
{
    (void) rng_state;
    while (len > 0) {
        *output = (unsigned char) rand();
        output += 1;
        len -= 1;
    }
    return 0;
}

I don't have easy access to a big-endian platform, could you test this on your platform to confirm?

mpg avatar Oct 02 '25 07:10 mpg

At first sight, the values look good. But there's a subtle new issue in your proposed implementation: The values cycle every 256 calls of rand(). Consequentially, myrand( NULL, buf, 128 ) toggles between two sets of data. If neither of them fulfills the condition in mbedtls_mpi_gen_prime() the RSA key generation still loops.

jbsgh avatar Oct 02 '25 09:10 jbsgh

The values cycle every 256 calls of rand().

I don't think they do. We're taking the low 8 bits of the output of rand() but that doesn't affect the fact that its internal state is larger. (Just the same way that if we only looked at the least significant bit, it wouldn't cycle every 2 calls.)

mpg avatar Oct 03 '25 10:10 mpg

But they do! Our C-runtime seems to use a simple LCG as rand() (I don't know how it's implemented on our embedded system in detail). Consider the following demo randtest.c:

#include <stdlib.h>
#include <stdio.h>

/* 
** Simple LCG implementation (TYPE_0) used by glibc 
** See https://github.com/lattera/glibc/blob/master/stdlib/random_r.c
*/
int myrand(void) {
    static int state = 1;
    return ( state = ( state * 1103515245 + 12345 ) & 0x7FFFFFFF );
}

void main( void )
{
    for ( int i = 0; i < 512; ++i )
        printf("%d\n", myrand() % 256 );    /* only least significant byte */
}

The program calls rand() 2 x 256 times and prints the returned value % 256 as suggested by your implementation. Compile it and run it. Check that there are 512 lines of output. Extract the first 256 values to randhead.txt, the last 256 values to randtail.txt. Check that both sequences are equal (e.g. with md5sum):

$ gcc -g -o randtest randtest.c 
$ ./randtest | wc -l
512
$ ./randtest | head -n 256 > randhead.txt
$ ./randtest | tail -n 256 > randtail.txt
$ md5sum randhead.txt randtail.txt
062b2b7c16bd5b9ba488efd958a814aa  randhead.txt
062b2b7c16bd5b9ba488efd958a814aa  randtail.txt

This test demonstrates that the values generated by your implementation may be repeated in quite short cycles. You're right that the generated sequence of pseudo-random numbers themselves is much longer, but your assumption doesn't hold for the sequence with values % 256.

The length of the cycle depends on the implementation of rand(). So don't be confused that the length of the cycle in this demo differs from the original length in my comment last week (https://github.com/Mbed-TLS/mbedtls/issues/10434#issuecomment-3360079963). I've implemented this demo on Linux instead of our real target.

Your suggested implementation in https://github.com/Mbed-TLS/mbedtls/issues/10434#issuecomment-3359577525 may repeat the generated random numbers in short cycles on some systems. If they don't meet the condition for the RSA keys the benchmark will still loop.

jbsgh avatar Oct 06 '25 06:10 jbsgh

Just the same way that if we only looked at the least significant bit, it wouldn't cycle every 2 calls.

For a decent random generator, sure, but rand() has a history of being implemented with simplicity and performance in mind, but not quality. LCG with poorly chosen parameters are common, and they have the defect that low-order bits have a very low period. “Shift out the lowest-order bit from rand() because it alternates at every call” used to be common advice in the 20th century.

I think we should stop using rand(). We already have issues for the self-tests and the unit tests (with a PR).

gilles-peskine-arm avatar Oct 06 '25 07:10 gilles-peskine-arm

Ok, thanks for the details @jbsgh ! I had tried on my system, but I guess my build of glibc must be using a more sophisticated RNG by default. It makes sense that embedded systems use very simple RNGs in order to minimize code size.

We're taking the low 8 bits of the output of rand() but that doesn't affect the fact that its internal state is larger.

So, the flaw in my reasoning is that for the kind of LCG used, the low-order n bits of the state only depend on the low-order n bits of the previous state. So the fact that the overall state is larger does nothing to save the low-order bits from having a short period. (To avoid that, we'd have to pick a modulus that's not a power of 2, but then that would be computationally more expensive, not to mention some architectures don't have hardware division.)

I think we should stop using rand().

Yes, I was coming to the same conclusion. I think for the benchmark program using a proper CSDRBG would make sense. We could use the PSA RNG if PSA is available, else CTR-DRBG else HMAC-DRBG.

mpg avatar Oct 06 '25 07:10 mpg

( state * 1103515245 + 12345 )

This code (from an archived repository) shows horrible choice of parameters. With proper numbers the internal state makes full circle going through all possible bit combinations without repetitions.

*output = (unsigned char) rand();

Higher order bits should be taken in case of simple linear generators; like (unsigned char) (rand() >> 16).

However, Visual C rand() already hides low bits as it returns 16-bit number while srand() uses 32 bits.

irwir avatar Oct 06 '25 13:10 irwir

@irwir: The LCG was just intended as a demo. We're all aware that this kind of pseudo-random numbers have limitations. Obviously, implementations like this one can be found in real world applications/runtimes.

@mpg: A quite simple solution based on @irwir's suggestion would be:

static int myrand(void *rng_state, unsigned char *output, size_t len)
{
    (void) rng_state;
    while (len > 0) {
        #if (RAND_MAX >= 0x10000)
        *output = (unsigned char) (rand() >> 16);
        #else
        *output = (unsigned char) rand() ; /* e. g. Visual C */
        #endif
        output += 1;
        len -= 1;
    }
    return 0;
}

Now the benchmark program doesn't loop anymore.

Btw: There are several functions using the same approach. Please update them accordingly.

  • framework/test/src/random.c: mbedtls_test_rnd_std_rand()
  • library/rsa.c: myrand() in library/rsa.c
  • programs/fuzz/common.c: dummy_random() and dummy_entropy()
  • programs/ssl/ssl_test_lib.c: dummy_entropy()
  • tests/src/test_helpers/ssl_helpers.c: mbedtls_test_random()

jbsgh avatar Oct 07 '25 07:10 jbsgh

BTW: An improved test would be #if (RAND_MAX >= 0x00FFFFFF). This condition ensures that the value of rand() shifted right by 16 bits covers the full range of an unsigned char.

jbsgh avatar Oct 08 '25 06:10 jbsgh

Sure, but at some point it's better to just stop using rand() (which comes with almost no guarantees from the standard) and move to something better, as we were planning to do anyway for other reasons.

That said, if you want to make a PR with a hotfix (that is, minimal changes, still using rand() but in a smarter way) for the cases that are problematic for you, I'd be willing to review it.

But I don't think we are going do to that ourselves - we're more likely to wait until we have the time to fully move away from rand(). (Just trying to set realistic expectations: your request is perfectly reasonable, but unfortunately we have more reasonable things to do than time to do it.)

mpg avatar Oct 23 '25 10:10 mpg

https://github.com/Mbed-TLS/mbedtls/pull/10492

jbsgh avatar Nov 03 '25 09:11 jbsgh