MatchZoo-py: Use of trainer/runner and number of training epochs

Open · littlewine opened this issue 4 years ago · 1 comment

Hi, I have a question regarding choosing the epochs and doing hyperparameter tuning in general.

I am currently using matchzoo.trainers.trainer to train my models with the default number of epochs (=10).

Does training always end at epoch 10, or does it keep some sort of checkpoint and then restore the model from the epoch where the validation result was best? This is not very clear to me from the documentation, and there is a lot of confusion because matchzoo and matchzoo-py have different tutorials and documentation.

Apart from that, my questions are:

If training always stops at the 10th epoch, how can I make it stop early and restore the model that achieves the best result on a validation metric? Ideally, I would like to do this with checkpoints, rather than using matchzoo.auto.tuner.tuner and re-training the model over and over, or some other hacky solution. I assume something is already in place to do that.
If the trainer does restore the checkpoint with the highest score after the 10 epochs have finished running: which metric is used to determine the highest score? Is it just the first metric in the list of task.metrics?

Thank you for your help!

May 07 '20 14:05 littlewine

@littlewine have you resolved this issue? In fact, the epoch at which checkpoints are saved can be set in advance.
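For reference, the general pattern asked about (keep the checkpoint from the best validation epoch, stop after a few epochs without improvement, then restore) can be sketched in plain Python. This is only an illustration and does not use the MatchZoo-py API: the class `BestCheckpointTracker`, the `patience` argument, the per-epoch scores, and the dictionary standing in for model weights are all hypothetical.

```python
import copy


class BestCheckpointTracker:
    """Keep the state from the epoch with the best validation score (higher is better)."""

    def __init__(self, patience=None):
        # Number of consecutive non-improving epochs tolerated; None disables early stopping.
        self.patience = patience
        self.best_score = None
        self.best_epoch = None
        self.best_state = None
        self._stale = 0

    def update(self, epoch, score, state):
        """Record one validation result; return True if training should stop."""
        if self.best_score is None or score > self.best_score:
            self.best_score = score
            self.best_epoch = epoch
            # Deep copy so later training steps do not overwrite the saved state.
            self.best_state = copy.deepcopy(state)
            self._stale = 0
        else:
            self._stale += 1
        return self.patience is not None and self._stale >= self.patience


# Hypothetical validation scores, one per epoch (made up for illustration).
scores = [0.41, 0.47, 0.52, 0.50, 0.49, 0.48]

tracker = BestCheckpointTracker(patience=2)
state = {"weights": 0}  # stand-in for a model's parameters
last_epoch = None

for epoch, score in enumerate(scores, start=1):
    state["weights"] = epoch  # stand-in for a training step that changes the model
    last_epoch = epoch
    if tracker.update(epoch, score, state):
        break  # no improvement for `patience` epochs

# Restore the best state instead of keeping the last one.
state = tracker.best_state
print(f"stopped at epoch {last_epoch}, restored epoch {tracker.best_epoch}")
```

With a real model, the saved state would typically be a copy of the model's parameters written to disk, and the score would be whichever validation metric is chosen as the selection key.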

Sep 20 '20 07:09 faneshion
