oml
oml copied to clipboard
Evaluate Random against Dieharder
The source code claims that the sequence of ints generated passes the famous Diehard tests of randomness.
But does it Dieharder?
so far I have:
| RNG | PASSED | FAILED | WEAK |
|---|---|---|---|
| URandom | 60 | 1 | 3 |
| OCaml | 63 | 1 | 0 |
| OCaml-50%-bias | 19 | 45 | 0 |
Where this is the OCaml test:
let () =
Random.self_init ();
while true do
print_char Random.(int 256 |> char_of_int);
flush stdout;
done
50% bias means print_char Random.(int 128 |> char_of_int); (for validation purposes),
and /dev/urandom is on:
$ uname -srpi
Linux 2.6.32-358.el6.x86_64 x86_64 x86_64
$ cat /etc/lsb-release
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
$ cat /etc/centos-release
CentOS release 6.4 (Final)
OCaml and urandom fail the same test: -d 201 a.k.a. rgb_minimum_distance:
$ dieharder -d 201 -h
#
# THE GENERALIZED MINIMUM DISTANCE TEST
#
# This is the generalized minimum distance test, based on the paper of M.
# Fischler in the doc directory and private communications. This test
# utilizes correction terms that are essential in order for the test not
# to fail for large numbers of trials. It replaces both
# diehard_2dsphere.c and diehard_3dsphere.c, and generalizes the test
# itself so that it can be run for any d = 2,3,4,5. There is no
# fundamental obstacle to running it for d = 1 or d>5, but one would need
# to compute the expected overlap integrals (q) for the overlapping
# d-spheres in the higher dimensions. Note that in this test there is no
# real need to stick to the parameters of Marsaglia. The test by its
# nature has three controls: n (the number of points used to sample the
# minimum distance) which determines the granularity of the test -- the
# approximate length scale probed for an excess of density; p, the usual
# number of trials; and d the dimension. As Fischler points out, to
# actually resolve problems with a generator that had areas 20% off the
# expected density (consistently) in d = 2, n = 8000 (Marsaglia's
# parameters) would require around 2500 trials, where p = 100 (the old
# test default) would resolve only consistent deviations of around 1.5
# times the expected density. By making both of these user selectable
# parameters, dieharder should be able to test a generator pretty much
# as thoroughly as one likes subject to the generous constraints
# associated with the eventual need for still higher order corrections
# as n and p are made large enough.
#