skorch
Method to set random state for all components
We need a method (possibly on the wrapper class) to initialize the random state for all components that are concerned with sampling. These include
- the model (e.g. weight init, dropout)
- DataLoader (batch shuffling)
- GridSearchCV split
We don't want a method for setting random states for everything, rather we want to enable setting the random state everywhere where it is needed.
For all PyTorch-related things, setting the seed using `torch.manual_seed` suffices; I think we are settled there. Are all sklearn random states exposed to the outside?
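On the sklearn side, every estimator deriving from `BaseEstimator` exposes its `random_state` through `get_params`/`set_params`, so it can be fixed from the outside. A minimal sketch, using `SGDClassifier` purely as an example of an estimator that has a random state:

```python
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier()
# get_params() lists every constructor argument, including random_state
assert "random_state" in clf.get_params()
# set_params() lets us fix it from the outside, e.g. from a wrapper class
clf.set_params(random_state=42)
print(clf.get_params()["random_state"])  # 42
```

This is the mechanism a wrapper class could rely on to propagate a seed into any sklearn component it holds.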
How can I set the seed deterministically for skorch? With the same neural network model, I obtain two different results on two different computers, even though I handled all the random seeds. So my conclusion is that maybe the problem lies in skorch.
In the abstract, it's quite hard to say what the reason could be, so if you have a minimal code example to reproduce the behavior, that would be great. In general, you need to think of the following sources of randomness:
- `torch.manual_seed`
- `torch.cuda.manual_seed`
- `numpy.random.seed`

If you use anything from sklearn or pandas, don't forget to fix `random_state`. If you use the internal CV split, try without it (by passing `train_split=None`).
It may be time to add a `random_state` keyword to `NeuralNet`.
What would you use it for? I only see `CVSplit` at the moment.
To set the random seed for `torch.manual_seed`, `torch.cuda.manual_seed`, `numpy.random.seed`, and `CVSplit`, all at once?
No, I don't think that it should be skorch's job to set seeds for torch and numpy. I could see a helper function that does it, but otherwise I would leave that to the user. Also, what would skorch do if a `numpy.random.RandomState` is passed?

What needs some rework is the fact that the `random_state` cannot be easily passed to `CVSplit`. Maybe we could change `CVSplit` to be uninitialized, so that users can have `NeuralNet(..., train_split__cv=7, train_split__stratified=False, train_split__random_state=42)`.
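The `train_split__random_state=42` idea relies on sklearn-style double-underscore routing of parameters to a sub-component that is only instantiated later. A toy sketch of that mechanism, with made-up class names (this is not skorch's actual implementation):

```python
class FakeSplit:
    """Stand-in for an uninitialized CVSplit-like component."""
    def __init__(self, cv=5, stratified=True, random_state=None):
        self.cv = cv
        self.stratified = stratified
        self.random_state = random_state

class FakeNet:
    """Routes train_split__* keyword arguments to the split class."""
    def __init__(self, train_split=FakeSplit, **kwargs):
        routed = {}
        for key, value in kwargs.items():
            prefix, sep, name = key.partition("__")
            if prefix == "train_split" and sep:
                routed[name] = value
        # the split is only instantiated here, with the routed parameters
        self.train_split_ = train_split(**routed)

net = FakeNet(train_split__cv=7, train_split__stratified=False,
              train_split__random_state=42)
print(net.train_split_.random_state)  # 42
```

Keeping the sub-component uninitialized until this point is what makes `train_split__random_state` (and grid search over it) possible.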
> No, I don't think that it should be skorch's job to set seeds for torch and numpy. I could see a helper function that does it, but otherwise I would leave that to the user.
Scikit-learn classifiers that have a random state allow for a `random_state` keyword in `__init__`. For example, `MLPClassifier` has a `random_state`, which is used to initialize the hidden layers.
> Also, what would skorch do if a `numpy.random.RandomState` is passed?
This is the blocker.
> Maybe we could change `CVSplit` to be uninitialized
This would fix the issue for `CVSplit`.
For reference, there was a previous discussion about this topic here: https://github.com/skorch-dev/skorch/issues/280
To resolve this issue, is the goal to have a function like `skorch.utils.set_random_state(pytorch_random_seed, numpy_random_state, ...)` and call this before doing anything?
> Scikit-learn classifiers that have a random state allow for a `random_state` keyword in `__init__`
We could do this. At the moment, I only see `CVSplit` as a potential target for this, though, and as long as `CVSplit` is passed in initialized form, having the `random_state` init parameter would not help. On the other hand, if `CVSplit` is uninitialized and it's possible to pass `train_split__random_state`, having `random_state` on `NeuralNet` would be useless.
> To resolve this issue, is the goal to have a function like `skorch.utils.set_random_state(pytorch_random_seed, numpy_random_state, ...)` and call this before doing anything?
I guess it wouldn't hurt to have such a function. Ideally, for me at least, I would like to call it like this: `set_random_seed(0)`, instead of `set_random_seed(0, 0, 0)`.
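A single-argument helper along those lines could look like the sketch below. The function name follows the comment above; the `try`/`except` guards and the seeding of Python's own `random` module are my additions to keep the sketch self-contained:

```python
import random

def set_random_seed(seed):
    """Seed all known sources of randomness with one value (sketch)."""
    random.seed(seed)                     # Python's built-in RNG
    try:
        import numpy as np
        np.random.seed(seed)              # numpy, used by sklearn splits etc.
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)           # weight init, dropout, shuffling
        torch.cuda.manual_seed_all(seed)  # GPU RNGs; a no-op without CUDA
    except ImportError:
        pass

set_random_seed(0)
```

This deliberately takes a single integer rather than a `numpy.random.RandomState`, sidestepping the question raised earlier of what to do when a `RandomState` instance is passed.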