
Saving checkpoint before evaluation

Open. AdrianKs opened this issue 3 years ago • 1 comment

Currently we store checkpoints after evaluation. If evaluation fails for some reason (e.g. an OOM error), we lose the complete epoch. Therefore, we should store the checkpoint before (or even while) we run the evaluation code.
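
To illustrate the proposed ordering, here is a minimal sketch. It is not the actual kge training loop: `save_checkpoint`, `run_epoch`, `train_fn`, `eval_fn` and the `valid_metric` key are hypothetical names, and JSON stands in for whatever serialization the real checkpoint uses. The idea is to write the checkpoint right after training, then run evaluation, then write it again with the metric attached.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Write the state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    # os.replace swaps the file in one step, so the old checkpoint
    # stays valid until the new one is fully written.
    os.replace(tmp_path, path)


def run_epoch(state, path, train_fn, eval_fn):
    """Train one epoch, checkpoint, then evaluate (hypothetical helper)."""
    state = train_fn(state)
    state["epoch"] += 1
    state["valid_metric"] = None  # marks the epoch as not yet evaluated
    save_checkpoint(state, path)  # the epoch survives a failing evaluation

    state["valid_metric"] = eval_fn(state)  # may raise, e.g. on OOM
    save_checkpoint(state, path)  # second write adds the metric
    return state
```

If `eval_fn` raises, the file at `path` still holds the trained epoch with `valid_metric` set to `None`, so a resumed job could re-run only the evaluation. The cost of this design is one extra checkpoint write per epoch.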

AdrianKs, Sep 16 '21 14:09

I think this problem can be avoided by evaluating the model before the training phase. Saving checkpoints after evaluation makes it possible to keep the best model and supports early stopping well.

AprLie, Oct 02 '22 15:10