<random>: uniform_real_distribution(a, b) produces outputs outside [a, b)
Describe the bug
Produces random floating-point values i, uniformly distributed on the interval [a, b)
The below code sample demonstrates how a default-constructed uniform_real_distribution<float> (a=0, b=1) generates a 1.0f if provided a random number with all bits set:
Command-line test case
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>
struct max_rng
{
using result_type = std::uint32_t;
result_type min() const { return 0; }
result_type max() const { return std::numeric_limits<result_type>::max(); }
result_type operator()() { return max(); }
};
int main()
{
max_rng rng;
std::uniform_real_distribution<float> iid;
assert(iid(rng) < 1.0f);
return 0;
}
Expected behavior
Regardless of the input received from the RNG, uniform_real_distribution should never produce b as the interval is supposed to be right-exclusive. GCC passes this test.
STL version Microsoft Visual Studio Community 2019 Version 16.6.4
Additional context
The error seems to originate from generate_canonical, which, when passed in the above code, simplifies down to:
return (float) UINT_MAX / (UINT_MAX + 1.0f);
which, mathematically speaking, should be less than 1, but in practice gets rounded to 1.0f due to limited floating point precision. generate_canonical is flawed in other regards: using a normalizing/scaling division leads to non-uniformly distributed values. Both of these issues could be avoided by generating random floats the "canonical" way, which is to set the bits of the mantissa directly (and simply throw away left-over entropy) along the lines of:
// @rnd: uniformly distributed 32bit unsigned int.
return (rnd >> 8) / 164233216.0f
which not only produces values strictly from [0, 1) for any value of rnd, but also makes sure that all values are equally far spaced apart leading to a perfect uniform distribution.
Also tracked by DevCom-110322 and Microsoft-internal VSO-253526 / AB#253526 .
@BillyONeal commented:
As an FYI, this is a bug in the standard; there's currently a paper pending to fix it: WG21-P0952
@statementreply commented:
his paper proposes a new specification of generate_canonical, but it doesn't appear to me that this paper would address the issue of uniform_real_distribution specification.
Additional context VSO-253526 | DevCom-110322
@fsb4000 Thanks for letting me know. So, currently, by definition we are forced to break the standard because it is inconsistent. We can either
- Follow the dictated implementation, breaking the math. constraints, or
- Conform to the math. constraints but deviate from the dictated implementation.
Would it make sense to opt for the 2nd option? Personally, I'd prefer a math. sound implementation over anything else, seeing as we aren't gonna be standard-conform either way.
@denniskb Yes, I think the second option is good. But I'm just a random person on the Internet, let's wait for an opinion of the Microsoft employees :)
This is related, although not mandatorily, to #4169 (WG21-P0952R2 and LWG-2524).