deno_std icon indicating copy to clipboard operation
deno_std copied to clipboard

`randomSeeded` only gives 32 bits of entropy (whereas `Math.random` gives 53 bits)

Open lionel-rowe opened this issue 7 months ago • 1 comments

I'm not sure how much this matters for typical use cases, but I think it's worth raising before @std/random is stabilized.

randomSeeded uses a single generated u32 from pcg32, then converts it to a float64 by dividing by u32 max + 1.

For most cases, this is probably fine, but it does mean that the float64's mantissa isn't uniformly entropic, so a consumer expecting a given range of bits to contain entropy might find that expectation disappointed.

This can be observed as follows:

import { randomSeeded } from 'jsr:@std/random'
import { patch, inspectBytes } from 'jsr:@li/custom-inspects'

patch(Uint8Array.prototype, inspectBytes)

const numFloats = 100
const seed = crypto.getRandomValues(new BigUint64Array(1))[0]!
const { BYTES_PER_ELEMENT } = Float64Array

for (const [name, fn] of [
    ['Math.random', Math.random],
    ['randomSeeded', randomSeeded(seed)],
] as const) {
    const b = new Uint8Array(numFloats * BYTES_PER_ELEMENT)
    const dv = new DataView(b.buffer)
    for (let i = 0; i < numFloats; ++i) dv.setFloat64(i * BYTES_PER_ELEMENT, fn(), true)

    console.log(`=== ${name} ===`)
    console.log(b)
    console.log()
}

Example output:

=== Math.random ===
Uint8Array(800) [
  ## x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xa xb xc xd xe xf
  0x 73 68 c0 78 57 1d e7 3f 54 ba 69 84 a0 87 e4 3f sh.xW..?T.i....?
  1x fc 81 bd 2d 18 b5 c8 3f a8 8c 88 e3 99 91 d4 3f ...-...?.......?
  2x 68 80 b9 fe 4c 39 e3 3f 72 19 52 df 15 4a dd 3f h...L9.?r.R..J.?
  3x 24 b0 9a ce 10 d1 ea 3f 20 cd 5c 2f 65 2c 99 3f $......? .\/e,.?
  4x 71 fb 6e 53 03 af e3 3f 29 bb 1e 59 34 e9 ef 3f q.nS...?)..Y4..?
  5x 07 ca 41 2d d4 e6 ea 3f ae fe 56 a2 9d 24 e4 3f ..A-...?..V..$.?
  6x ba 6c 70 8b ab 3c d8 3f 47 21 24 da d4 28 e5 3f .lp..<.?G!$..(.?
  ... 800 B total
]

=== randomSeeded ===
Uint8Array(800) [
  ## x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xa xb xc xd xe xf
  0x 00 00 00 06 14 5c d3 3f 00 00 80 36 aa 30 c0 3f .....\.?...6.0.?
  1x 00 00 00 6a 4b b4 b0 3f 00 00 60 21 66 d2 e6 3f ...jK..?..`!f..?
  2x 00 00 c0 8a f9 f6 ea 3f 00 00 40 f5 d6 76 de 3f [email protected].?
  3x 00 00 c0 8d 2f 0d d8 3f 00 00 60 4c 15 1e e7 3f ..../..?..`L...?
  4x 00 00 40 c9 1c 8c d6 3f 00 00 00 e9 e9 8b ea 3f ..@....?.......?
  5x 00 00 c0 fb 2d f3 d3 3f 00 00 c0 38 b6 ff df 3f ....-..?...8...?
  6x 00 00 00 36 d5 41 df 3f 00 00 a0 f2 2a 92 eb 3f ...6.A.?....*..?
  ... 800 B total
]

You can see from the repeated 3f bytes that Math.random isn't uniformly entropic either (as the exponent and sign bit don't change, i.e. only the 53-bit mantissa contains entropy), but it doesn't contain repeated 00 bytes like randomSeeded does.

lionel-rowe avatar Apr 18 '25 03:04 lionel-rowe

Relevant:

https://github.com/tc39/proposal-seeded-random/issues/20

Given these tradeoffs, the "divide by 2^53" approach seems best, even though this means it will actually never emit many of the floats in [0, 1) - that's probably not a problem in practice.

With pcg32, that approach could look something like this:

const MASK = 2 ** (53 - 32) - 1
const CEIL = 2 ** 53 // == Number.MAX_SAFE_INTEGER + 1

function randomSeeded53Bit(seed: bigint) {
    const pcg = fromSeed(seedFromU64(seed, 16))
    const dv = new DataView(new ArrayBuffer(8))
    return () => {        
        dv.setUint32(0, nextU32(pcg) & MASK)
        dv.setUint32(4, nextU32(pcg))

        const n53 = Number(dv.getBigUint64(0))
        // assert(n53 >= 0 && n53 <= Number.MAX_SAFE_INTEGER)
        const val = n53 / CEIL
        // assert(val >= 0 && val < 1)
        return val
    }
}

lionel-rowe avatar Apr 18 '25 04:04 lionel-rowe

fixed by #6626

kt3k avatar Jun 27 '25 04:06 kt3k

fixed by #6626

@kt3k It's not strictly fixed by that, as randomSeeded still has this problem. It's just a question of whether it's better to leave as-is or change it before a stable version of @std/random is released.

lionel-rowe avatar Jun 27 '25 05:06 lionel-rowe

ah, ok. re-opening

It's just a question of whether it's better to leave as-is or change it before a stable version of @std/random is released.

Is the downside perf degradation?

kt3k avatar Jun 27 '25 06:06 kt3k

Is the downside perf degradation?

I haven't tested yet, my guess is approx. twice as slow, as it requires generating two u32s per float. Probably the main downside is that it's breaking, but maybe that's not too important pre-stable release.

lionel-rowe avatar Jun 27 '25 07:06 lionel-rowe