liblinear Fixed bug : random number generator under Windows 32/64

Fixed bug : under windows, some problems required a lot more iterations to converge, than the hardcoded max number (for example this max number is 1000 in solve_l2r_l1l2_svc). Therefore we had to recompile the code with a larger max number to make it reach the same results than under Linux.

This issue is explained in the liblinear FAQ under "Windows Binary Files" section but the fix provided in the FAQis actually incorrect and not valid for both 32-bit and 64-bit systems.

This commit fixes the random number generator for windows, to make it work the same way than in Linux. It also includes an auto-check performed at startup.

Nov 17 '16 14:11 smarie

Another solution may be implementing a simple random number generator according to glibc's rand(). (which is roughly a simple linear congruential generator)

please refer to the source code in glibc. val = ((state[0] * 1103515245) + 12345) & 0x7fffffff; It can be implemented by only few lines of code.

Actually, in multi-core LIBLINEAR, we implement rand() by our own to ensure thread safety related issues of glibc's rand().

Nov 18 '16 01:11 infwinston

@smarie btw, could you explain why the fix in FAQ is incorrect? do you mean that by applying the above modification, the behavior in windows is still not the same as the one in linux?

Nov 18 '16 17:11 infwinston

@infwinston as documented in the code patch: max random number in Windows is 15 bits, which is 32767. max random number in linux+GCC is 31 bits (resp. 63 bits in 64 bits systems I guess) so that's 2147483647 (resp 9223372036854775807).

The fix provided in the FAQ proposes to use (rand()*32768+rand()), that means, ((rand()<<15)+rand()), which leads to a 30 bits (15+15) random number instead of a 31bits one. Hence my "quick fix" (that is added only when compiling for windows targets):

first I check the maximum int available using std::numeric_limits::max(): is it 31 or 63.
then I generate a proper random number covering that number of bits, for example if it's 31 bits, ((rand() << 16) + (rand() << 1) + (rand() >> 14)); that shifts the FAQ suggestion one bit more to the left and adds the last missing bit with (rand() >> 14). Similarly if it is 63, I use ((rand() << 48) + (rand() << 33) + (rand() << 18) + (rand() << 3) + (rand() >> 12));
Under windows platform, I added an auto-check at startup that performs the same shifting/adding operation on 32767 (the max rand() on windows), and compares it with the maximum integer available (std::numeric_limits::max()). If there is a difference, the program stops and throws an error.

Notes

there might be better ways - I'm no C++ expert - but that's all I found to perform a quick fix.
I don't know if adding consecutive random numbers generated by the same random number generator preserve the randomness distribution property, most probably not.
A quick look on google provides plenty of other ways to go. Many people recommend the boost library, but maybe that's overkill here in terms of size.. The links I found, for what's worth:
- http://stackoverflow.com/questions/3958795/different-rand-results-on-windows-and-linux
- http://stackoverflow.com/questions/32730906/random-generates-same-number-in-linux-but-not-in-windows
- https://github.com/cmcqueen/simplerandom
- http://www-cs-faculty.stanford.edu/~knuth/news02.html#rng
- http://stackoverflow.com/questions/34903356/c11-random-number-distributions-are-not-consistent-across-platforms-what-al
- http://stackoverflow.com/questions/922358/consistent-pseudo-random-numbers-across-platforms

Probably the best would be to use your implementation from multi-core liblinear to ensure consistency across the two libs ? I did not check what you actually implemented there. Thanks anyway for these great libs!

Nov 21 '16 08:11 smarie

@smarie Thanks for your detailed explanation. I understand your point now. What I suggested is that we can implement a stand-alone random number generator, which is the same as rand() (TYPE_0 one) in glibc. In our scenario, we might not need strong randomness, so I think implementing a simple rng is easier (< 10 lines of code) and a direct way to solve the inconsistency issue between linux and windows. Thanks again for pointing out this!

Nov 24 '16 04:11 infwinston

Super. That way, other windows users will be able to reach convergence instead of having to increase max_iterations to a ridiculously high level and recompile :).

Meanwhile in the FAQ you could maybe change the proposed fix to ((rand() << 16) + (rand() << 1) + (rand() >> 14)) or equivalently to ((rand()*65536)+(rand()*2)+(rand()/16384)) since apparently even windows 64 has 32-bit int (http://stackoverflow.com/questions/384502/what-is-the-bit-size-of-long-on-64-bit-windows, thanks @tavianator ).

Some additional thoughts: on 32-bit integer systems (including windows 64 in this case then), or on very difficult (ill-posed ? :) ) problems I guess, it might still be useful to easily increase the max number of iterations - we never know... Providing an option in the API to increase it (including in the wrappers: mex, etc...) would provide an additional lever to the users facing these cases. Thanks again - looking forward to seeing the next versions ! :)

Nov 24 '16 09:11 smarie

Good to see this work on Windows support. Can I make some style suggestions?

To keep the main code focused on SVM, I'd suggest to move windows-specific code to a separate file. In the main code, this could just be a function call like #ifdef _WIN32 windows_check_rand() #endif or so. In Makefile.win this file could be included.
Elsewhere Malloc is being used to choose the appropriate malloc implementation. Wouldn't it be cleaner to use Rand(), and have a header file determine whether that is just rand() or something else? (Might consider adding check_rand() in the mix too with an #ifdef.)
Right now, the algorithm to expand the random number is duplicated in the check function. This could be eliminated by passing a function argument to the random (helper) function.

Combining this becomes something like:

// in header
#ifdef _WIN32
 #define Rand windows_random
 #define Check_random windows_check_random
#else
 #define Rand rand
 #define Check_random ((void)0) // nothing to do
#endif

// in windows.c
int _windows_random(int (*rand)()) {
 // if ...
 return /* ... */ rand() /* ... */;
}

int windows_random() {
 return _windows_random(rand);
}

const char *windows_check_random() {
 if (std::numeric_limits<int>::max() != _windows_random(0x7fff)) {
   // ...
 }
 // ...
}

And ... please feel free to disagree, I think the code is already a step forward for Windows users. Don't let my comment hold off merging.

Feb 13 '17 11:02 wvengen

+1: thanks @wvengen for the feedback

Mar 07 '17 13:03 smarie

Hi, I came across this issue again by seeing it on scikit-learn. I took this opportunity to

fix the C4293 warning messages by using this recommended solution
submit an updated version of the windows binaries including mex files
submit a fix for libsvm since it has the same issue: https://github.com/cjlin1/libsvm/pull/140

Can you please have a look and merge ? That would help windows users have correct results, instead of non-converged ones.

Mar 25 '19 16:03 smarie