Seed random number generation for query language rng functions
At the moment, there is not a good way to seed query language functions like randomGaussian() or randomInt() in a way that's thread-safe and efficient. This may take some real investigation.
A seeded PRNG is by its nature stateful, so a static function cannot give you one (the intention being some sort of reproducible PRNG associated with that specific query). If a seeded PRNG is needed, the user will need some way to create that state and then reference it from the query.
An approach that might satisfy your needs, but isn't a PRNG, is some sort of mixing or hashing function based on a state (which we can assume to be the internals of the PRNG):
table.view(["X = mix_or_hash(my_seed ^ ii)"])
Depending on the quality of the hash, this may be a reasonable proxy for a PRNG. Utilities based on this hash could be built out:
table.view(["X = stateRandomInt(my_seed ^ ii, 0, 5)", "Y = stateRandomGaussian(my_seed ^ ii)"])
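To make the hashing idea above concrete, here is a minimal Python sketch. It assumes a SplitMix64-style finalizer as the mixing function, which is only one possible choice. The names mix64, state_random_double, state_random_int and state_random_gaussian are hypothetical, not existing query language functions, and my_seed and ii stand in for the seed and row index used in the query.

```python
import math

MASK64 = (1 << 64) - 1  # keep arithmetic in the unsigned 64-bit range


def mix64(state: int) -> int:
    """SplitMix64-style mix of a 64-bit state (an assumed choice of hash)."""
    z = (state + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def state_random_double(state: int) -> float:
    """Hypothetical helper: map a state to a float in [0, 1)."""
    return (mix64(state) >> 11) / float(1 << 53)


def state_random_int(state: int, lo: int, hi: int) -> int:
    """Hypothetical helper: map a state to an int in [lo, hi).

    Uses a plain modulo, so there is a slight bias unless (hi - lo)
    divides 2**64.
    """
    return lo + mix64(state) % (hi - lo)


def state_random_gaussian(state: int) -> float:
    """Hypothetical helper: Box-Muller transform on two hashed uniforms."""
    h1 = mix64(state)
    h2 = mix64(h1)  # second hash, derived from the first, for the angle
    u1 = ((h1 >> 11) + 1) / float(1 << 53)  # in (0, 1], so log() is defined
    u2 = (h2 >> 11) / float(1 << 53)        # in [0, 1)
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


# Stand-ins for the seed and row index in the query.
my_seed = 12345
xs = [state_random_int(my_seed ^ ii, 0, 5) for ii in range(10)]
ys = [state_random_gaussian(my_seed ^ ii) for ii in range(10)]
```

Because each value depends only on my_seed and ii, the result for a given row is the same no matter which thread evaluates it or in what order. Whether the output is statistically good enough depends on the quality of the hash.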
And https://en.wikipedia.org/wiki/Linear_congruential_generator might be fast and good enough for this use case. Or https://en.wikipedia.org/wiki/Permuted_congruential_generator.
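For reference, a linear congruential generator is only a few lines. The sketch below is a hypothetical Python illustration, not a proposed implementation; it assumes a modulus of 2**64 and the multiplier and increment from Knuth's MMIX generator, and the class name Lcg64 is made up for this example.

```python
MASK64 = (1 << 64) - 1  # modulus 2**64, applied as a bit mask


class Lcg64:
    """Minimal 64-bit LCG: state = (A * state + C) mod 2**64."""

    # Multiplier and increment from Knuth's MMIX generator (assumed choice).
    A = 6364136223846793005
    C = 1442695040888963407

    def __init__(self, seed: int) -> None:
        self.state = seed & MASK64

    def next_u64(self) -> int:
        """Advance the state by one step and return it."""
        self.state = (self.A * self.state + self.C) & MASK64
        return self.state

    def next_double(self) -> float:
        """Return a float in [0, 1) built from the high 53 bits.

        The low bits of a power-of-two-modulus LCG have short periods,
        so only the high bits are used.
        """
        return (self.next_u64() >> 11) / float(1 << 53)
```

Note that this generator is stateful, so it still needs to be created once per query and referenced from it, as discussed above.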
After looking at this and thinking about it for a while, I'm going to delay doing anything.
- The current random number generation uses ThreadLocalRandom to get high efficiency random number generation on multiple threads. Moving away from ThreadLocalRandom as the default seems like a bad choice.
- Pseudo-random number generators start with a seed and then create a sequence of numbers that obey certain statistical properties. Generating a sequence of random number generators with a sequence of seeds and then taking one number from each generator does not produce a sequence of random numbers with the same properties. It is possible that the numbers superficially look the same, but there is not a guarantee that numbers created this way will have the correct properties.
As a result, I'm going to delay action until we get more specific user feedback on what is desirable.
- We could let users create their own RNG for special cases, such as seeding.
- We could create a single threaded RNG that is less performant but supports a seed.
Either way, it is prudent to wait for more user specifications before moving forward.
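To give a sense of what the second option could look like, here is a rough Python sketch of a seeded generator that serializes access with a lock. It is an illustration under assumptions, not a design: the class SeededRng and its method names are hypothetical, and it wraps Python's standard random.Random rather than anything in the query engine.

```python
import random
import threading


class SeededRng:
    """Hypothetical seeded generator; a lock serializes access across threads."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)
        self._lock = threading.Lock()

    def random_int(self, lo: int, hi: int) -> int:
        """Return an int in [lo, hi)."""
        with self._lock:
            return self._rng.randrange(lo, hi)

    def random_gaussian(self) -> float:
        """Return a sample from the standard normal distribution."""
        with self._lock:
            return self._rng.gauss(0.0, 1.0)


# Two generators with the same seed produce the same sequence
# when called in the same order.
rng_a = SeededRng(7)
rng_b = SeededRng(7)
seq_a = [rng_a.random_int(0, 5) for _ in range(10)]
seq_b = [rng_b.random_int(0, 5) for _ in range(10)]
```

The lock keeps the internal state consistent, but results are reproducible only if the calls arrive in the same order. When several threads share one generator, that order is not deterministic, which is why such a generator would effectively have to be used from a single thread.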