skorch
skorch copied to clipboard
Checkpoint dirname argument can also be a callable
Solves #848
Description
For Checkpoint
, as of now, dirname
can only be a string. With this
update, it can also be a callable with no arguments that returns a string.
What this solves is that the directory that a model is saved in can now
contain a dynamic element. This way, if you run, e.g., grid search with
n_jobs>1
+ checkpoint, each checkpoint instance can have its own
directory name (e.g. using a function that returns a random name), while
the files inside the directory still follow the same naming.
Without such a possibility, if a user runs grid search with n_jobs>1
and
checkpoint with load_best=True
, the loaded model would always be
whatever happens to be the latest one stored, which can result
in (silent) errors.
If anyone has a better idea how to solve the underlying problem, I'm open to it.
Implementation
As a consequence of the dirname
now not being known at __init__
time, I
removed the validation of the filenames from there. We still validate
them inside initialize
, which is sufficient in my opinion.
In theory, we could call the dirname
function inside __init__
to
validate it, and then call it again inside initialize
to actually set
it, but I don't like that. The reason is that we would call a
function that is possible non-deterministic or might have side effects
twice, with unknown consequences. This should be avoided if possible.
Example
Before, code like this would fail:
params = {
'module__num_units': [10, 20, 30],
}
net = NeuralNetClassifier(
ClassifierModule,
callbacks=[Checkpoint(f_history=None, load_best=True)]
)
gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', n_jobs=3)
gs.fit(X, y)
# error message:
# RuntimeError: Error(s) in loading state_dict for ClassifierModule:
# size mismatch for dense0.weight: copying a param with shape torch.Size([20, 20]) from checkpoint, the shape in current model is torch.Size([10, 20]).
With this feature, we can do:
import random
def make_dirname():
return f"test-dir-{random.randint(0, 1_000_000)}"
net = NeuralNetClassifier(
ClassifierModule,
callbacks=[Checkpoint(f_history=None, load_best=True, dirname=make_dirname)]
)
gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', n_jobs=3)
gs.fit(X, y)
and it works.