MatchZoo-py
Use of trainer/runner and number of training epochs
Hi, I have a question regarding choosing the epochs and doing hyperparameter tuning in general.
I am currently using `matchzoo.trainers.trainer` to train my models with the default number of epochs (=10).
Does this always end training at epoch 10, or does it keep some sort of checkpoints and then restore the checkpoint/model from the epoch where the validation result is best? This is not very clear to me from the documentation, and there is a lot of confusion because matchzoo and matchzoo-py have different tutorials and documentation.
Apart from that, my questions are:
- If training always stops at the 10th epoch, how can I make it stop and restore the model that achieves the best results on a validation metric? Ideally, I would like to do this with checkpoints, rather than using `matchzoo.auto.tuner.tuner` and re-training the model over and over, or some other hacky solution. I guess there should already be something in place to do that.
- If the trainer does restore the checkpoint with the highest score after the 10 epochs have finished running: which metric is used to determine the highest score? Is it just the first metric in the list of `task.metrics`?
Thank you for your help!
@littlewine have you resolved this issue? In fact, the epoch at which checkpoints are saved can be set in advance.
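For anyone who wants the "keep the best epoch, stop early" behaviour regardless of what the trainer does internally, here is a minimal framework-agnostic sketch. It is not MatchZoo-py's API: `BestCheckpointTracker`, `run_epochs`, the metric name `"map"`, and the validation scores are all hypothetical and only illustrate the idea of selecting a checkpoint by one named validation metric with a patience limit.

```python
import copy
from typing import Any, Dict, List, Optional


class BestCheckpointTracker:
    """Hypothetical helper: remembers the state with the best validation score."""

    def __init__(self, key: str, patience: Optional[int] = None):
        self.key = key                # name of the metric used for selection
        self.patience = patience      # epochs without improvement before stopping
        self.best_score = float("-inf")
        self.best_epoch: Optional[int] = None
        self.best_state: Any = None
        self._bad_epochs = 0

    def update(self, epoch: int, metrics: Dict[str, float], state: Any) -> bool:
        """Record one validation result; return True if training should stop."""
        score = metrics[self.key]
        if score > self.best_score:
            # New best: snapshot the state so later epochs cannot overwrite it.
            self.best_score = score
            self.best_epoch = epoch
            self.best_state = copy.deepcopy(state)
            self._bad_epochs = 0
        else:
            self._bad_epochs += 1
        return self.patience is not None and self._bad_epochs >= self.patience


def run_epochs(val_scores: List[float], patience: Optional[int]) -> BestCheckpointTracker:
    """Simulate a training loop using made-up validation scores."""
    tracker = BestCheckpointTracker(key="map", patience=patience)
    for epoch, score in enumerate(val_scores, start=1):
        state = {"epoch": epoch}  # stand-in for the model weights
        if tracker.update(epoch, {"map": score}, state):
            break
    return tracker


# Made-up scores: the best value within the patience window is at epoch 2.
scores = [0.50, 0.62, 0.60, 0.61, 0.59, 0.70]
stopped_early = run_epochs(scores, patience=3)
full_run = run_epochs(scores, patience=None)
```

In a real PyTorch loop the `state` would typically be `model.state_dict()`, and the best weights would be restored afterwards with `model.load_state_dict(tracker.best_state)`. Note the trade-off the made-up scores show: with a patience of 3 the loop stops before the late improvement at epoch 6, so the patience value itself is something to tune.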