xnmt
Feature request: saving current model at regular intervals
It would be nice to be able to save the current model to a temporary file every x iterations or every x minutes, or, even better, to save the full state of training (current batch, optimizer state, weights).
This would help restart training after an unexpected interruption (out of memory, crash, etc.), especially when the training set is very large and there are potentially many hours between two evaluations.
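To make the request concrete, here is a minimal sketch of what such periodic saving could look like. It is not xnmt or DyNet code: `PeriodicCheckpointer`, `maybe_save`, and `load_checkpoint` are hypothetical names, and the training state is assumed to be a plain picklable Python object. In practice the pickle call would be replaced by whatever model-saving routine the toolkit provides.

```python
import os
import pickle
import tempfile
import time


class PeriodicCheckpointer:
    """Hypothetical helper: saves state every N iterations and/or every T seconds."""

    def __init__(self, path, every_n_iters=None, every_n_seconds=None,
                 clock=time.monotonic):
        self.path = path
        self.every_n_iters = every_n_iters
        self.every_n_seconds = every_n_seconds
        self.clock = clock  # injectable so the timer can be tested
        self.last_save_time = clock()

    def _due(self, iteration):
        # Iteration-based trigger.
        if self.every_n_iters and iteration > 0 and iteration % self.every_n_iters == 0:
            return True
        # Time-based trigger.
        if (self.every_n_seconds is not None
                and self.clock() - self.last_save_time >= self.every_n_seconds):
            return True
        return False

    def maybe_save(self, iteration, state):
        """Save `state` if a checkpoint is due; return True if a file was written."""
        if not self._due(iteration):
            return False
        directory = os.path.dirname(os.path.abspath(self.path))
        # Write to a temporary file first, then rename, so that a crash
        # in the middle of saving never leaves a truncated checkpoint.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "wb") as f:
                pickle.dump({"iteration": iteration, "state": state}, f)
            os.replace(tmp_path, self.path)
        except BaseException:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise
        self.last_save_time = self.clock()
        return True


def load_checkpoint(path):
    """Load the most recent checkpoint written by PeriodicCheckpointer."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

A training loop would call `maybe_save(iteration, state)` once per update and, on restart, call `load_checkpoint` to find the iteration to resume from. Writing to a temporary file and renaming it is a design choice aimed at the crash scenario described above.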
This can already be done with the dev_every option. Note that this also triggers an evaluation on the dev set, but I think that should be OK. Does this seem like a reasonable solution?
With regard to saving the state of training, that would be good, but it would require enhancements to DyNet.