tempfile icon indicating copy to clipboard operation
tempfile copied to clipboard

Replace `fastrand` with `getrandom` and `base64`

Open vks opened this issue 1 year ago • 8 comments

  • Instead of setting up a predictable userspace RNG, we get unpredictable random bytes directly from the OS. This fixes #178.
  • To obtain a uniformly distributed alphanumeric string, we convert the random bytes to base64 and throw away any letters we don't want (+ and /). With a low probability, this may result in obtaining too few alphanumeric letters, in which case we request more randomness from the OS until we have enough.
  • Because we cannot control the seed anymore, a test manufacturing collisions by setting the same seed for several threads had to be removed.

vks avatar Aug 16 '22 23:08 vks

Wouldn't using the URL-safe Base64 alphabet avoid the issues with + and /? (using - and _ instead, which are both valid in filenames)

tarcieri avatar Aug 16 '22 23:08 tarcieri

  • I'd prefer not to introduce a dependency on base64. We can achieve the same result by taking each byte mod 64, throwing out 62 and 63, then mapping it to an alphabet. That will give us uniform randomness.
  • We should just read more randomness than we need (e.g., 256 bytes) and store it in thread-local storage.
  • Looking into getrandom, I'm concerned about blocking on early boot (per the getrandom documentation). We'd be trading a potential and unlikely DoS, for a deadlock on some platforms (e.g., an init system built in rust).
  • We should absolutely not panic (e.g., unwrap the result from getrandom), under any circumstances.

Stebalien avatar Aug 17 '22 00:08 Stebalien

Hm. It also looks like we have a gap in our CI (https://github.com/Stebalien/tempfile/pull/189). We should at least build on WASM (without WASI), even if we can't actually create temporary files.

Stebalien avatar Aug 17 '22 00:08 Stebalien

Looking into getrandom, I'm concerned about blocking on early boot (per the getrandom documentation). We'd be trading a potential and unlikely DoS, for a deadlock on some platforms (e.g., an init system built in rust).

I guess the question here is: are we worse off? I haven't taken a look at rand or fastrand, they may have the same early-boot issue.

I'd prefer not to introduce a dependency on base64.

Basically, people tend to look at dependencies with skepticism. It's a small dependency, but it's still yet another thing to audit (and having a temporary file library depend on it is a bit strange).

Stebalien avatar Aug 17 '22 00:08 Stebalien

I haven't taken a look at rand or fastrand, they may have the same early-boot issue.

Fastrand seeds its RNG initially with the thread id and Instant::now. It does not use any other forms of entropy.

Stracing rand generating a i32 with just rand::generate::<i32>(), it calls

getrandom(NULL, 0, GRND_NONBLOCK)     = 0
getrandom("\x73[...]\x41\xb1", 32, 0) = 32

I can't quite figure out where the call to NONBLOCK is coming from. Checking the docs, rand does seem problematic to use in early boot.

It is possible that when used during early boot the first call to OsRng will block until the system’s RNG is initialised. It is also possible (though highly unlikely) for OsRng to fail on some platforms, most likely due to system mis-configuration.

(https://docs.rs/rand/0.8.5/rand/rngs/struct.OsRng.html)

And OsRng is what rand's default RNG uses (as a base source of entropy)

In any case, if we can't get entropy at all, or if getrandom fails, then we'd need to fallback to... Current time and thread id, I suppose? So, keep on using fastrand, or roll our own generation.

5225225 avatar Aug 17 '22 08:08 5225225

@Stebalien Would you be ok with a base64 alphabet (e.g. by adding - and _) to simplify the implementation?

I can get rid of the base64 dependency, but this would essentially require vendoring a base64 encoder (or using the random bytes inefficiently).

vks avatar Aug 17 '22 17:08 vks

Looking into getrandom, I'm concerned about blocking on early boot (per the getrandom documentation). We'd be trading a potential and unlikely DoS, for a deadlock on some platforms (e.g., an init system built in rust).

Couldn't such a special application generate its own random file name?

vks avatar Aug 17 '22 17:08 vks

Would you be ok with a base64 alphabet (e.g. by adding - and _) to simplify the implementation?

I want to stick to alpha-numeric (least surprise).

Couldn't such a special application generate its own random file name?

Not really. I really don't want to limit the use-cases of this library.


But this is getting significantly more complicated than I had anticipated. It feels like we're basically re-inventing the rand crate.

I have limited time for the next few weeks, but I'm going to think about this a bit and come back to you.

Stebalien avatar Aug 17 '22 18:08 Stebalien