OpenNMT-py
Added saving of best checkpoint during early stopping
This addresses the issue at https://github.com/OpenNMT/OpenNMT-py/issues/1856 and follows the pull request discussion at https://github.com/OpenNMT/OpenNMT-py/pull/1858 . The best checkpoint is no longer removed when early stopping is triggered.
Not sure why you apply this logic only when early_stopped is triggered.
What happens if you have -keep_checkpoints < -early_stopping (tolerance)? You could still remove your best checkpoint.
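The scenario above can be reproduced with a small simulation. This is only a sketch under simplifying assumptions: one checkpoint is saved and validated per step, rotation is modeled as a fixed-size queue, and early stopping fires after `tolerance` consecutive validations without improvement. The function name `simulate` and the perplexity values are made up for illustration and are not part of OpenNMT-py.

```python
from collections import deque


def simulate(ppls, keep_checkpoints, tolerance):
    """Return (best_step, kept_steps) at the moment early stopping fires.

    Hypothetical model: a checkpoint is saved at every step, only the
    last `keep_checkpoints` are kept, and training stops after
    `tolerance` validations in a row that do not improve perplexity.
    """
    kept = deque(maxlen=keep_checkpoints)  # oldest checkpoint is evicted first
    best_ppl, best_step, stalled = float("inf"), None, 0
    for step, ppl in enumerate(ppls, start=1):
        kept.append(step)
        if ppl < best_ppl:
            best_ppl, best_step, stalled = ppl, step, 0
        else:
            stalled += 1
            if stalled >= tolerance:
                break  # early stopping triggered
    return best_step, list(kept)


# keep_checkpoints (2) is smaller than the early stopping tolerance (4)
best_step, kept = simulate(
    [10.0, 8.0, 9.0, 9.5, 9.7, 9.9], keep_checkpoints=2, tolerance=4
)
print(best_step, kept)
```

With these values the best checkpoint has already been rotated out by the time early stopping fires, whether or not the saver applies special handling at that moment.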
This is the best I could come up with:
- In trainer.py, added forced validation at every save_checkpoint_steps.
- When the model is saved, if the save is triggered by early stopping, the best step is passed along as well; otherwise only the current step's validation perplexity is passed.
- In model_saver.py, the best checkpoint is updated if (a) this is the first checkpoint saved, (b) the current step is better than the previous best, or (c) the initial best_step is not None (the save was triggered by early stopping) and it is still the best. In case (c), step != best_step, so the current step cannot be treated as the best: the last step before an early stop need not hold the best checkpoint, unless the stop was triggered by stalling. Otherwise, the existing best checkpoint from the config is kept.
- The best checkpoint is skipped when old checkpoints are removed.
- best_ckpt_config.json always holds the best step and perplexity, even when early_stopping was not set initially when training started.
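The retention rule in the list above could be sketched roughly as follows. This is not the code from the pull request: the class `BestAwareSaver`, its method names, the checkpoint file naming, and the JSON keys `best_step` and `perplexity` are all assumptions made for illustration. Only the file name best_ckpt_config.json comes from the description. The sketch also assumes the best checkpoint is kept in addition to the `keep_checkpoints` most recent ones.

```python
import json
import os
import tempfile
from collections import deque


class BestAwareSaver:
    """Hypothetical saver that rotates checkpoints but never deletes the best."""

    def __init__(self, directory, keep_checkpoints):
        self.directory = directory
        self.keep_checkpoints = keep_checkpoints
        self.queue = deque()  # steps whose checkpoint files are on disk
        self.best_step = None
        self.best_ppl = float("inf")

    def _path(self, step):
        return os.path.join(self.directory, f"model_step_{step}.pt")

    def save(self, step, ppl):
        # Stand-in for real model serialization.
        with open(self._path(step), "w") as f:
            f.write("checkpoint")
        self.queue.append(step)

        # First save, or better than the previous best: record it.
        if ppl < self.best_ppl:
            self.best_step, self.best_ppl = step, ppl
            config = {"best_step": step, "perplexity": ppl}
            with open(os.path.join(self.directory, "best_ckpt_config.json"), "w") as f:
                json.dump(config, f)

        # Rotate the oldest non-best checkpoints; the best one is skipped.
        non_best = [s for s in self.queue if s != self.best_step]
        while len(non_best) > self.keep_checkpoints:
            victim = non_best.pop(0)
            self.queue.remove(victim)
            os.remove(self._path(victim))


with tempfile.TemporaryDirectory() as tmp:
    saver = BestAwareSaver(tmp, keep_checkpoints=2)
    # Validation is assumed to run at every save, as in the trainer.py change.
    for step, ppl in enumerate([10.0, 8.0, 9.0, 9.5, 9.7, 9.9], start=1):
        saver.save(step, ppl)
    remaining = sorted(os.listdir(tmp))
    with open(os.path.join(tmp, "best_ckpt_config.json")) as f:
        config = json.load(f)
```

With the made-up perplexities above, the checkpoint for step 2 stays on disk alongside steps 5 and 6, and the config file records step 2 as the best. Counting the best checkpoint outside the `keep_checkpoints` budget is one possible design choice; counting it inside the budget would also work, provided rotation never selects it.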
Sorry if this is getting long; this is the best I could think of to cover all cases. Suggestions welcome. Thanks.
@francoishernandez any update on this?