rust_rdrand icon indicating copy to clipboard operation
rust_rdrand copied to clipboard

Output is (too) biased

Open briansmith opened this issue 2 years ago • 4 comments

Please see https://github.com/rust-random/getrandom/issues/228. I'd rather not going to repeat all the details here because I'd like to keep the conversation in one place. However, I also want you to be aware of this concern. Maybe I'm overlooking something.

briansmith avatar Oct 01 '21 00:10 briansmith

Yeah, this is unfortunate. My memory of this issue is that there's no good way to use CPU detection to filter out the CPUs that do the wrong thing (e.g. there were some bulldozers marketed as Ryzen 1000 series), but nowadays, I believe just considering anything before zen2 be broken would cover everything.

For systemd the x != 0 as a solution makes most sense mostly because it cares more about not blocking the boot process than it is about bias in the randomness. Everything else just copied that.

nagisa avatar Oct 01 '21 09:10 nagisa

Fixed in https://github.com/nagisa/rust_rdrand/commit/5ef19218ce5bd216fbb444f36624f9c6e192c2e1 I believe.

nagisa avatar Oct 03 '21 15:10 nagisa

Some comments on the fix:

  • The comment says "On AMD processor families < 0x17." I don't have enough direct information on the matter but I think BoringSSL excludes some 0x17 CPUs, as I noted in https://github.com/rust-random/getrandom/issues/228#issue-1012707302. So I'm not sure this is conservative enough. See also https://github.com/rust-random/getrandom/issues/228#issuecomment-932084202 where it was noted that other AMD CPUs may have issues (apparently, depending on the motherboard firmware).
  • There is some duplicate code that seems unfortunate, e.g. the duplicate blocks for filtering out bad AMD processors. This makes the change harder to read than if the duplicate logic were factored out in some way.
  • Since the fix wasn't done using the GitHub PR workflow, it's difficult to give line-by-line commentary on the changes. (I used to make changes in a similar way and eventually I switched to the GitHub PR workflow for even my own changes, and I think it actually improved my workflow, although it seemed like a waste of time when I started.)

briansmith avatar Oct 04 '21 17:10 briansmith

The comment says "On AMD processor families < 0x17." I don't have enough direct information on the matter but I think BoringSSL excludes some 0x17 CPUs

As my previous comment indicates, I remember some Ryzen CPUs being affected, but those were not actually Zen, but rather an older family (16h). 17h model 0x7?s are Matisse which led me to recall https://www.phoronix.com/scan.php?page=news_item&px=AMD-Releases-Linux-Zen2-Fix and then a much more widely covered “Destiny 2 does not work” problem.

I do understand why BoringSSL would decide to filter out these CPUs given this information, but its pretty disappointing that there is no more conclusive information. e.g. I'm not entirely sure if it wasn't just rdrand always reporting a failure to generate a random number (and the game/systemd trying to infinitely generate one regardless). At the very least a lack of a linux kernel workaround for this problem would indicate to me that it isn't serious at all… Not quite sure what the best way forward here is.


My reading of the other systemd issue (https://github.com/systemd/systemd/issues/18184) is that the issue is not actually there except for a motherboard with a beta release of firmware for it. I'm not sure how feasible it is for us to meaningfully safeguard against users using non-production firmware for security sensitive workloads. After all, anybody can pick open a microcode blob and do something weird to the rdrand related code therein if they really wanted to.


There is some duplicate code that seems unfortunate, e.g. the duplicate blocks for filtering out bad AMD processors. This makes the change harder to read than if the duplicate logic were factored out in some way.

I'll think about a better approach here.


Since the fix wasn't done using the GitHub PR workflow, it's difficult to give line-by-line commentary on the changes. (I used to make changes in a similar way and eventually I switched to the GitHub PR workflow for even my own changes, and I think it actually improved my workflow, although it seemed like a waste of time when I started.)

Point taken.

nagisa avatar Oct 04 '21 18:10 nagisa