allenact
allenact copied to clipboard
Properly resume training with offpolicy losses
Currently, a new epoch will be started when resuming training (new Iterator will be instantiated). We should save the random seed used to shuffle the datasets (for all workers?) and the length of the remaining data, besides enforcing a resume API for iterators.