tempfile icon indicating copy to clipboard operation
tempfile copied to clipboard

Use of predictable RNG

Open vks opened this issue 3 years ago • 13 comments

This library recently switched to non-CSPRNGs (see #141 and #162), which are inherently predictable based on observations of their output. Doesn't this make the library vulnerable to attacks listed here, because the generated filenames can be predicted?

vks avatar Apr 30 '22 16:04 vks

This temporary file library is equivalent to mkstemp. The relevant part of the document is:

However, mkstemp() still suffers from the use of predictable file names and can leave an application vulnerable to denial of service attacks if an attacker causes mkstemp() to fail by predicting and pre-creating the filenames to be used.

This is one of those cases where you need to understand the threat-model. There are many theoretical attacks that simply aren't worth the effort to guard against:

  1. It relies on being able to observer the creation of many temporary files (to be able to predict future ones).
  2. It relies on being able to create arbitrarily named temporary files in the same directory as the target application. If you can do this, you can likely exhaust inodes, use up all disk space, etc.
  3. It is, at worst, a DoS vector. It can't be used to actually trick the target application into doing anything.

Basically, if an attacker is in a place where they can exploit this issue, you likely have bigger issues.

Also note, python, golang, and C all behave the same way (use non-cryptographic randomness for performance).


If you want to improve the situation, I'd be happy to accept a patch that re-seeds the random number generator with system randomness if it fails to create a temporary file after some number of tries.

edited: My original comment made statements about security "experts" which could have been taken as a comment about security researchers in general. This was not my intention, nor was it relevant.

Stebalien avatar May 02 '22 14:05 Stebalien

I disagree with 1., "many" can be three, depending on the RNG.

I agree that 2. is a problem anyway, and that ideally the temporary files should not share a namespace with other applications, but temporary files are often used like this.

Non-cryptographic randomness is not a performance advantage, CSPRNGs have comparable speed, because they are optimized for SIMD while the non-CSPRNGs are not. Non-CSPRNGs are simpler, not necessarily faster.

The scenario you are mentioning (DoS) is the least concerning. I would be much more concerned about an attacker manipulating file permissions and content, or escalating privileges. This can be avoided with flags like O_CREAT and O_EXCL (and more recently O_TMPFILE), but as far as I know not all operating systems (and file systems) support them.

What is the motivation for not using a CSPRNG? Compilation time and code complexity?

vks avatar May 02 '22 22:05 vks

The scenario you are mentioning (DoS) is the least concerning. I would be much more concerned about an attacker manipulating file permissions and content, or escalating privileges. This can be avoided with flags like O_CREAT and O_EXCL (and more recently O_TMPFILE), but as far as I know not all operating systems (and file systems) support them.

This is the main reason this crate exists. It handles all the platform quirks around O_CREAT and O_EXCL, and does not depend on randomness for security. This library will never re-use an existing file when creating a temporary file (and will use O_EXCL where possible). In practice:

  1. On Macos and Windows, the temporary file directory is always per-user.
  2. On Linux, O_TMPFILE will be used when creating unnamed temporary files.

The only thing affected would be named temporary files/directories on Linux where the user is using the shared /tmp directory directly.

What is the motivation for not using a CSPRNG? Compilation time and code complexity?

Complexity and dependency weight. Performance isn't really an issue given that the syscall cost to create a file will likely dominate in most cases.

Stebalien avatar May 03 '22 14:05 Stebalien

Can this crate be used in production where the filenames must be unpredictable ?

If the crate does not guarantee predictability then it perhaps should simply be documented as such.

This crate is advertising itself as:

A secure, cross-platform, temporary file library for Rust.

Thanks

pinkforest avatar Aug 14 '22 07:08 pinkforest

(got sent here from the advisory-db link)

If you want to improve the situation, I'd be happy to accept a patch that re-seeds the random number generator with system randomness if it fails to create a temporary file after some number of tries.

There's no method in the std to get system randomness, would a dependency on getrandom be acceptable here? (I suspect a fair few crates already have it in their dependency tree). Failing that, the windows/mac/linux specific bits of getrandom could be vendored.

In my testing, on linux at least, getrandom is fast enough to the point where it's reasonable to use it directly for entropy (a few million [u8; 32]'s per second), so the whole RNG could maybe be removed. Re-seeding / only using entropy directly as a fallback would also be fine.

Especially since you're doing a syscall and probably file IO afterwards, the perf of getting entropy is likely fine.

5225225 avatar Aug 14 '22 11:08 5225225

Can this crate be used in production where the filenames must be unpredictable ?

Yes, it absolutely can. Same as python and go which use the same randomness function for temporary files.

Worst-case, there's a DoS vector. But if you have some untrusted application that can create many files with chosen names in a shared temporary directory of temporary files, you this is the least of your concerns.

Security is all about understanding your threat model and making trade-offs (performance, complexity, etc.).

There's no method in the std to get system randomness, would a dependency on getrandom be acceptable here? (I suspect a fair few crates already have it in their dependency tree). Failing that, the windows/mac/linux specific bits of getrandom could be vendored.

A call to getrandom would be fine given, as you say, we're already performing a syscall. I'm happy to accept a patch to do so (but it's not really a priority).

Stebalien avatar Aug 14 '22 20:08 Stebalien

Cool thanks for clarifying -

We are pretty practical at RustSec and we have recommended your crate over others -

We recently had to file advisory on temporary and previously tempdir and your crate popped up so I wanted to clarify this.

https://rustsec.org/advisories/RUSTSEC-2018-0022.html https://rustsec.org/advisories/RUSTSEC-2018-0017.html

Thanks for working on a crate that focuses on security and portability :unicorn:

pinkforest avatar Aug 14 '22 20:08 pinkforest

Using getrandom instead is possible, but this will require mapping &[u8] to alphanumeric strings, which requires vendoring some kind of integer sampling.

vks avatar Aug 16 '22 16:08 vks

Using getrandom instead is possible, but this will require mapping &[u8] to alphanumeric strings, which requires vendoring some kind of integer sampling.

  1. Define a dictionary (A-Za-z0-9).
  2. Take each byte of input modulo the length of the dictionary and look it up in the dictionary.

Stebalien avatar Aug 16 '22 17:08 Stebalien

The best solution is probably to read, maybe, 1k of bytes into a thread-local buffer, then draw randomness from that until we run out.

Stebalien avatar Aug 16 '22 17:08 Stebalien

Take each byte of input modulo the length of the dictionary and look it up in the dictionary.

Unfortunately, that method is slightly biased and does not result in a uniform alphanumeric distribution.

vks avatar Aug 16 '22 23:08 vks

Unfortunately, that method is slightly biased and does not result in a uniform alphanumeric distribution.

Hm, that slightly biases the first 8 characters.

Stebalien avatar Aug 17 '22 00:08 Stebalien

Modulo bias would be arguably Fine here since we only need a value that can't be perfectly predicted consistently, not a value that can be predicted even slightly better than average.

5225225 avatar Aug 17 '22 07:08 5225225