thrust::uniform_int_distribution<uint64_t> exclusively produces multiples of 4096
thrust::uniform_int_distribution uses uniform_real_distribution under the hood. For 64-bit integers, however, this means the lower 12 bits are always 0.
For example:
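#include <cstdint>
#include <iostream>
#include <thrust/random.h>

int main( int argc, char** argv )
{
    thrust::uniform_int_distribution<uint64_t> dist;
    // The other engines show the same problem:
    // thrust::default_random_engine engine{ 0 };
    // thrust::taus88 engine{ 0xdeadbeef };
    thrust::ranlux48 engine{ 0 };
    // Print each draw in hex so the zeroed low digits are easy to spot
    for( uint64_t i = 0; i < 10000; i++ )
        std::cout << std::hex << dist( engine ) << std::endl;
}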
will produce 0s in the lower bits for all the thrust random number generators.
Good catch, thanks for reporting this.
Sounds like a casting issue somewhere in the implementation: a 64-bit float has a 52-bit mantissa, leaving 12 bits for sign and exponent information, so that sounds relevant.
For the record, the code above produces 16 zero bits (4 hex zeros) at the end, not 12 bits (3 hex zeros), because it uses a 48-bit engine (ranlux48). Even if the bug weren't there, the fourth-to-last hex digit would still be 0. The output would have this form:
...
92bbe2033d240495
ab0fdb800d8d0c8f
b081c91470c903c1
b6432b47d7c7078f
31b262cf5c1700ce
...
@djns99 is still right, though, that thrust::uniform_int_distribution is broken because it relies on uniform_real_distribution. Exactly as Allison explained, this is because the mantissa is only 52 bits:
64-bit integer input: 64 - 52 = 12 bits lost in conversion
When we convert back to an integer (shift left by 12 bits) after the random distribution computation, these 12 bits are 0s.
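To make the precision loss concrete, here is a minimal sketch that pushes a 64-bit integer through a double and back using only the standard library. The helper name round_trip_through_double is hypothetical and this is not Thrust's actual code path; it only illustrates that a double cannot carry all 64 bits. For values at or above 2^63, adjacent doubles are 2^11 apart, so at least the low 11 bits are lost; the exact count inside Thrust depends on how it scales the value, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Thrust): convert a 64-bit integer to a
// double and back again, the way an int distribution built on a real
// distribution effectively has to.
std::uint64_t round_trip_through_double(std::uint64_t x)
{
    // Inputs above this bound could round up to exactly 2^64, which no
    // longer fits in uint64_t, so the sketch refuses them.
    assert(x <= 0xFFFFFFFFFFFFF800ULL);
    const double d = static_cast<double>(x);   // rounds to 53 significant bits
    return static_cast<std::uint64_t>(d);      // the dropped low bits come back as 0
}
```

Small values survive unchanged (everything below 2^53 is exactly representable), while a large value such as 0xdeadbeefcafef00d comes back with its low digits cleared.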
This suggests there is also a corresponding bug in uniform_real_distribution. I guess this comes down to a question of what Thrust's goals are. If we are trying to provide robust random utilities, then this needs to be fixed as well.
Some statistical methods can be quite sensitive to systematic biases like these, and generally the point of using the library types is to get good distribution without needing to research it yourself.
I vote to refactor both uniform_real_distribution and uniform_int_distribution to match libstdc++, perhaps.
E.g. this std::generate_canonical function appears to be what's missing for the real distribution.
For uniform_int_distribution we should be able to use the libstdc++ algorithm. I believe this is standard, as we have seen it in a couple of places.
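As a rough illustration of what building on std::generate_canonical could look like, here is a sketch using the standard library rather than Thrust. The helper name uniform_real_sketch is hypothetical, and this is an assumption about the shape of a fix, not what libstdc++ or Thrust actually does. The relevant property is that std::generate_canonical calls the engine as many times as needed to fill the requested number of bits, which matters for engines such as ranlux48 that deliver fewer than 53 bits per call.

```cpp
#include <limits>
#include <random>

// Hypothetical helper (not Thrust's API): uniform double in [lo, hi)
// built on top of std::generate_canonical.
template <class Engine>
double uniform_real_sketch(Engine& eng, double lo, double hi)
{
    // Request as many random bits as a double's significand holds (53).
    // generate_canonical draws from the engine repeatedly if one call
    // does not supply enough bits.
    const double u =
        std::generate_canonical<double, std::numeric_limits<double>::digits>(eng);
    // Floating-point rounding in the scaling step can, in rare cases,
    // land exactly on hi; a production version would need to guard that.
    return lo + u * (hi - lo);
}
```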
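For comparison, an integer-only approach could look like the sketch below. It uses plain rejection sampling and is not a copy of the libstdc++ algorithm; the names uniform_u64_sketch and low_bits_seen are hypothetical, and the sketch assumes an engine that already produces the full 64-bit range (such as std::mt19937_64). A real implementation would also have to combine several draws for narrower engines like ranlux48.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper: uniform integer in [lo, hi] without any
// floating-point step, so no low bits are lost.
template <class Engine>
std::uint64_t uniform_u64_sketch(Engine& eng, std::uint64_t lo, std::uint64_t hi)
{
    constexpr std::uint64_t max64 = std::numeric_limits<std::uint64_t>::max();
    static_assert(Engine::min() == 0 && Engine::max() == max64,
                  "sketch assumes an engine covering the full 64-bit range");
    assert(lo <= hi);
    const std::uint64_t span = hi - lo;
    if (span == max64) return eng();          // full range: use the raw draw
    const std::uint64_t n = span + 1;         // number of possible results
    // Reject the top (2^64 mod n) raw values so every result is equally likely.
    const std::uint64_t limit = max64 - (max64 % n + 1) % n;
    std::uint64_t r;
    do { r = eng(); } while (r > limit);
    return lo + r % n;
}

// Hypothetical check: OR together the low 12 bits of `draws` full-range samples.
template <class Engine>
std::uint64_t low_bits_seen(Engine& eng, int draws)
{
    std::uint64_t seen = 0;
    for (int i = 0; i < draws; ++i)
        seen |= uniform_u64_sketch(eng, 0, std::numeric_limits<std::uint64_t>::max()) & 0xFFFULL;
    return seen;
}
```

With this shape, the low 12 bits of full-range draws are populated, unlike the output reported in this issue.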