xtensor icon indicating copy to clipboard operation
xtensor copied to clipboard

Numpy and xtensor random engine does not match

Open james-mxr opened this issue 1 year ago • 5 comments

Running a list of random numbers with the same seed yields different results. Python:

np.random.seed(1)
print (np.random.randint(0, 1000, size=100))

Python results

[ 37 235 908  72 767 905 715 645 847 960 144 129 972 583 749 508 390 281
 178 276 254 357 914 468 907 252 490 668 925 398 562 580 215 983 753 503
 478 864  86 141 393   7 319 829 534 313 513 896 316 209 264 728 653 627
 431 633 456 542  71 387 454 917 561 313 515 964 792 497  43 588  26 820
 336 621 883 297 466  15  64 196  25 367 738 471 903 282 665 616  22 777
 707 999 126 279 381 356 155 933 313 595]

Cpp:

xt::random::seed(1);
std::cout << xt::random::randint({100}, 0, 1000) << std::endl;

Cpp results:

{845, 139, 124, 368, 263, 313, 491, 341, 759, 432, 248, 249, 516, 943,  13,
 340, 302, 721, 242, 716, 750, 493, 346, 204, 723, 676, 674, 652, 829, 254,
  18, 996, 487, 703, 537, 273, 303, 166, 392, 310, 725, 929, 785, 223, 623,
 445, 998, 249, 721, 664, 364, 865,  88, 320,  21, 723, 111, 594, 337,   8,
 446, 999, 459, 358, 213, 473, 481, 627, 572, 848, 329, 323,  20,  42, 972,
 848, 565, 659, 289, 370,  31, 832,  12, 545, 775,  90, 919, 671, 346, 729,
  40, 118, 977, 707, 143, 342, 687, 677, 452, 187}

james-mxr avatar Jul 19 '22 11:07 james-mxr

This is inherit: xtensor uses the C++ default random number generator and NumPy does not. The two should therefore not match up. Things are even more worrisome, as the sequence will likely be different on different platforms / compilers. I was stuck with this problem too, so I have created a simple random number generator with Python bindings to be able to draw the random sequence in C++ and Python on any platform with any compiler : https://github.com/tdegeus/prrng . It does not have randint for the moment, but that should be 2 seconds work. If you are interested you can open a PR (or an issue, but then it might take a bit of time).

That being said, NumPy has switched generators to be very similar to prrng. So maybe we should consider an overall unification with the new NumPy API in the future.

tdegeus avatar Jul 26 '22 08:07 tdegeus

So xtensor uses MT19937 if I am understanding this header correctly, https://github.com/xtensor-stack/xtensor/blob/308458574baa95b0dabb981a4bae97fa29ff673f/include/xtensor/xrandom.hpp#L41

But even if you set numpy to also use MT19937 as the generator the results still come out different,

prng = np.random.Generator(np.random.MT19937(1))
prng.integers(low=0, high=1000, size=100)

array([240, 900, 729, 108, 558, 930, 526, 177, 146, 330, 907, 153, 466,
       686, 435, 151, 169, 937, 278,  28, 742, 935, 413, 359,  46, 357,
        46, 774, 944, 994, 588, 110,  58, 827, 186, 897,  60, 455, 970,
       273, 791, 797, 949, 423, 317, 866, 667, 192,  81, 374, 567, 617,
        94, 773, 688, 462, 415, 724, 134, 700, 960,  74, 980, 981, 949,
       208, 109, 590, 185, 856, 972, 802, 447, 678, 859,  79, 893, 962,
       280, 134, 751, 945, 890, 349,  67, 358, 760, 781, 496, 301, 267,
        94, 706, 555, 108, 161, 235,   8, 705, 661], dtype=int64)

@tdegeus Is this entirely down to your point about the variation across platforms and compilers ?

DuncanM-XR avatar Jul 26 '22 13:07 DuncanM-XR

I cannot say for certain. Is your seed the same?

What I can say is that too much remained unclear to me to depend on it for my research. To that end I created prrng.

tdegeus avatar Jul 26 '22 14:07 tdegeus

Yeah the seed is 1 in the numpy example I posted and in the xtensor example.

It may also be something on numpy's side that is causing it to be different, rather than xtensor, but as you say it is unclear.

DuncanM-XR avatar Jul 26 '22 14:07 DuncanM-XR

You are more than welcome to dive into this, and propose improvements if you find them. Similarly, you are more than welcome to use prrng and suggest improvements for it

tdegeus avatar Jul 26 '22 16:07 tdegeus