Remove bias when generating secrets
Hi, thanks so much for this project! It's really helping me out. I looked over the code and found something that might be improved.
The way secrets were generated not all outcomes were equally likely. Here's wat happened:
- A number of random bytes were generated.
- Each byte was mapped to one of 83 "ascii" values (or 61 without symbols).
- (Somehow the lowercase letter "j" was left out, or it would have been 84 and 62)
- Mapping was done with floor(byte-value / 255 * (set-length - 1)), this caused heavy bias against the last item in the set. It should be floor(byte-value / 256 * set-length). Even then there's some bias in the mapping, because the bits just don't fit, set indices 0, 21, 42, and 63 are 33.3% more likely to occur.
- That string was then encoded as hex or base32, but since only 83 (or 61) byte values were possible for each position in the source string, out of every 10 possible hex/base32 strings, only about three would be generated. That's a pretty heavy bias.
So now I do this:
- Generate a number of random bytes,
- For the hex and base32 outputs, just encode those bytes as hex and base32.
- For the ascii output, use the (slightly improved) mapping function.
- Now hex and base32 are purely unbiased, and ascii still has some bias in it, so I think it should be deprecated, or moved to base64.
- One caveat, you can no longer save the ascii output on one occassion, translate it to base32 or hex yourself, and reuse it at a later occassion. Because now the base32 and hex outputs just have more random bits stuffed in them, which are discarded for the ascii output.
I hope this is all clear! Thanks in advance for looking at this.
Coverage remained the same at 100.0% when pulling 9aee69d865eae942edcc4c39d4b42e7271809c6d on pepve:unbiased-secrets into e63b936d562fa16c5970d0821a221cf1a59f559b on speakeasyjs:master.
@markbao How likely will this change break downstream code?
I can't speak to closed source uses of Speakeasy, but I could only find one usage with possibly breaking code, in an unforked, unstarred, unwatched repository: https://github.com/pefish/common-express/blob/f3e44569b8eeb6430b7c7856a71ef213050056e4/utils/GoogleAuthenticator.js. I used this search: https://github.com/search?utf8=%E2%9C%93&q=speakeasy+generateSecret+ascii&type=Code.
Any reason this isn't merged, yet?
@Panoplos perhaps this code is no longer maintained? @markbao
on an unrelated note, I stumbled across this myself earlier today and was super upset. Good job @pepve finding and fixing this!
@pepve FYI I merged this code into https://github.com/mlogan/squeakeasy and published the fork as a new npm module called squeakeasy, since this one appears to be truly abandoned.