Different results on different machines for the same input and random seed
Hi,
I am getting different results on different machines for the same inputs and random seed, though the results on any single machine are consistent.
My questions are -
- What might be the possible reasons for this?
- What can be done to avoid it?
Thanks.
Within the library, all sources of random behaviour are delegated to an instance of np.random.RandomState. It is used for random number generation when needed, and it is also passed as a parameter to sklearn's GP. I don't know what else can be done, to be honest.
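For illustration, this is roughly how the seed gets threaded through (a minimal sketch assuming the usual BayesianOptimization constructor; argument names may differ slightly between versions):

```python
from bayes_opt import BayesianOptimization

def black_box(x, y):
    # Toy objective to maximize
    return -x ** 2 - (y - 1) ** 2 + 1

optimizer = BayesianOptimization(
    f=black_box,
    pbounds={"x": (-2, 2), "y": (-3, 3)},
    random_state=1,  # becomes the internal np.random.RandomState,
                     # which is also handed to sklearn's GP
)
optimizer.maximize(init_points=2, n_iter=5)
print(optimizer.max)  # best point found so far
```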
Thanks @fmfn for the quick reply. In that case, can you please let me know whether you were already aware of this issue, or is this thread the first time you are encountering it?
After np.random.RandomState was added everywhere I imagined this would not be a problem, but never really tested it, to be honest.
To my knowledge the package is setting a seed for all sources of randomness that allow for seed control. I suspect scipy.minimize might be the source of the problem here, but, as I said, I'm not sure.
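To make the suspicion concrete, the acquisition maximization looks roughly like the sketch below (the helper name and arguments here are illustrative, not the verbatim source). The random restarts are fully determined by the seeded RandomState, but scipy.optimize.minimize (L-BFGS-B) can still take slightly different floating-point paths on different CPUs/BLAS builds, and those differences would propagate to the suggested points:

```python
import numpy as np
from scipy.optimize import minimize

def acq_max(acq, bounds, random_state, n_warmup=10000, n_restarts=10):
    # Illustrative sketch, not the actual library code.
    # Coarse random search: fully determined by the seeded RandomState.
    x_tries = random_state.uniform(bounds[:, 0], bounds[:, 1],
                                   size=(n_warmup, bounds.shape[0]))
    ys = acq(x_tries)
    x_max, y_max = x_tries[ys.argmax()], ys.max()

    # Local refinement: the starting points are seeded, but L-BFGS-B's
    # floating-point arithmetic may differ across machines.
    x_seeds = random_state.uniform(bounds[:, 0], bounds[:, 1],
                                   size=(n_restarts, bounds.shape[0]))
    for x0 in x_seeds:
        res = minimize(lambda x: -acq(x.reshape(1, -1))[0],
                       x0, bounds=bounds, method="L-BFGS-B")
        if res.success and -res.fun > y_max:
            x_max, y_max = res.x, -res.fun
    return np.clip(x_max, bounds[:, 0], bounds[:, 1])
```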
Help with debugging this is more than appreciated. I won't have the opportunity to chase this bug down for a little while still.
Yes, I suspect the same, but I don't have time this weekend. I will give it a shot next weekend.
When you find the solution, please post it in the comments. Thank you very much!
There is the same problem in skopt: https://github.com/scikit-optimize/scikit-optimize/issues/682
@jiayeah9508 Hi, do you mind sharing the processor information of the two machines on which you tested the code you posted in #682?
Specifically, I am interested in the output of the command below; I have a hunch that I need to confirm.
```sh
cat /proc/cpuinfo | grep "model name"
```
I used a Windows system so I cannot cat... The first one is an Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz and the second one is an Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz. I wonder if that meets your requirements~
Yes, this is what I needed, and it strengthens my hunch. Here is what I think is going on.
The root cause, I believe, is the machine's underlying instruction set. Older machines that only support the SSE4.2 instruction set can behave differently from newer ones using AVX: wider vector registers change the order and grouping of floating-point operations, and since floating-point addition is not associative, numpy/scipy can produce slightly different results under the two instruction sets. I am still not 100% sure about this hunch since I do not have access to an SSE4.2-only machine.
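As a toy illustration of the mechanism (this only emulates lane-wise accumulation in numpy; it does not literally toggle SSE/AVX):

```python
import numpy as np

# Floating-point addition is not associative, so the same sum evaluated
# in different orders (scalar vs SSE-like vs AVX-like reductions) can
# differ in the last bits -- enough to nudge an argmax elsewhere.
rng = np.random.RandomState(0)
x = rng.standard_normal(100_000)

scalar = 0.0
for v in x:                                      # strictly sequential sum
    scalar += v
two_lanes = x.reshape(-1, 2).sum(axis=0).sum()   # 2 column-wise partial sums
four_lanes = x.reshape(-1, 4).sum(axis=0).sum()  # 4 column-wise partial sums

print(scalar, two_lanes, four_lanes)  # typically agree only to ~15 digits
```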
The first time I encountered this bug was while running test cases on a Jenkins server to which I do not have SSH access. I will try to get access to the server and see whether my current hypothesis is correct.
For reference, this link, which points at the same problem, might help.
What about AMD processors? Optimization is becoming more and more important for deep learning... That seems like bad news for Intel, haha.