java-random icon indicating copy to clipboard operation
java-random copied to clipboard

Values seem to diverge over iterations

Open Wrexial opened this issue 1 year ago • 2 comments

Heya, been trying to narrow down a bug in a tool and I've narrowed down the issue to be from this package.

In essense i can reduce the issue to something simple:

image

Having a constant start seed (42) and a constant int bounds (14453), the first 164,455 values the random generator would give us the correct matching values with 164,455 having the value of 7947 on both java and python. Than at 164,456 in java it gives us the value 723 whereas in python we'd get a value of 9069. Interestingly, at iteration 164,457 in python we'd get the 723 that java gave us beforehand.

Would you have any idea what's happening? I tried to debug it but I'm not sure why the next function is returning different values. I did notice Java uses AtomicLong and I've tried to use atomiclong but it didnt change the values returned so I don't think that's the cause.

EDIT: ohh yeah, using latest JDK and latest python version if it matters

Wrexial avatar Mar 14 '23 19:03 Wrexial

I don't know offhand. Likely one of the intermediate values needs to be masked like a short, given an & 0xffff at some point. I would encourage you to read the original Java source code, as that's how I originally implemented it.

MostAwesomeDude avatar Mar 15 '23 15:03 MostAwesomeDude

Cheers, that helped immensly. Narrowed it down and it ended up being the nextInt function having an underflow in java which doesn't underflow in python.

Theres probably a cleaner solution, but the following worked for us with around 400k iterations it didn't diverge.

def nextInt(self, n = None):
        """
        Return a random int in [0, `n`).

        If `n` is not supplied, a random 32-bit integer will be returned.
        """

        if n is None:
            return self.next(32)

        if n <= 0:
            raise ValueError("Argument must be positive!")

        # This tricky chunk of code comes straight from the Java spec. In
        # essence, the algorithm tends to have much better entropy in the
        # higher bits of the seed, so this little bundle of joy is used to try
        # to reject values which would be obviously biased. We do have an easy
        # out for power-of-two n, in which case we can call next directly.

        # Is this a power of two?
        if not (n & (n - 1)):
            return (n * self.next(31)) >> 31

        bits = self.next(31)
        val = bits % n
        t = bits - val + n - 1
        while (t < 0 or t >= 0x7FFFFFFF):
            bits = self.next(31)
            val = bits % n
            t = bits - val + n - 1

        return val

EDIT: tried opening a PR but this repo is private, but its an easy change anyways.

Wrexial avatar Mar 16 '23 18:03 Wrexial