EverybodyDanceNow_reproduce_pytorch

How can I be sure that training continues from the last checkpoint after it was interrupted?

Open talatccan opened this issue 5 years ago • 5 comments

Hi,

I'm trying to continue training from the last saved model after interrupting the training. At first I started training, saved the first epoch in checkpoints, and stopped it. After that I set load_pretrain = './checkpoints/target/' in ./src/config/train_opt.py and started training again, but it started from epoch 1 like before. I was expecting it to continue from epoch 2.

How can I be sure it's continuing from the last saved epoch?

talatccan avatar Oct 19 '19 10:10 talatccan

Same issue, need help.

andrewhani14 avatar Aug 02 '20 13:08 andrewhani14

Same issue. Did you guys figure it out?

zibozzb avatar Sep 03 '20 04:09 zibozzb

So, the problem is here: if you print out those values, you will find that this line doesn't work during training, so I just assigned pretrained_path manually right before this statement, and it eventually worked.

iluvrachel avatar Sep 04 '20 06:09 iluvrachel

Thank you for your reply. I tried printing them out; however, the value of "pretrained_path" is "./checkpoints/target/", which is correct, I guess. The issue is that training starts from epoch 1 rather than from where the last training left off, so I am not sure whether it actually continues training. In addition, a new log file (in ./checkpoints/target/logs) is created instead of the previous log file being updated.

zibozzb avatar Sep 04 '20 14:09 zibozzb

In the train opt file, there are two 'load_pretrain' args; you should delete one of them.
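A quick way to check for this kind of duplicate is sketched below. It assumes the options file is a plain Python module with top-level assignments such as load_pretrain = '...'; if the file defines its options some other way, this check will not find them. The helper name duplicate_assignments and the example string are made up for illustration and are not part of the repo.

```python
import ast
from collections import Counter


def duplicate_assignments(source):
    """Return the names assigned more than once at the top level of `source`."""
    tree = ast.parse(source)
    counts = Counter()
    for node in tree.body:
        # Only simple top-level statements of the form `name = value` are counted.
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    counts[target.id] += 1
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical config content, not copied from the repo.
example = (
    "load_pretrain = ''\n"
    "niter = 20\n"
    "load_pretrain = './checkpoints/target/'\n"
)

print(duplicate_assignments(example))
```

In a plain Python module the later assignment silently replaces the earlier one, so which value ends up being used depends on the order of the two lines.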

In addition, although the correct model is loaded, the printed log still starts from epoch 1. Maybe you can change the 'start_epoch' variable in 'train_pose2vid.py'.
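One way to avoid editing 'start_epoch' by hand is to store the last finished epoch next to the checkpoint and read it back on startup. The sketch below only illustrates that idea and is not the repo's code: the file name progress.json and the functions save_progress, load_start_epoch and train are assumptions, and the actual training step and weight saving are left as a comment.

```python
import json
import os
import tempfile


def save_progress(checkpoint_dir, epoch):
    """Record the last finished epoch beside the model checkpoint."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    with open(os.path.join(checkpoint_dir, "progress.json"), "w") as f:
        json.dump({"last_finished_epoch": epoch}, f)


def load_start_epoch(checkpoint_dir):
    """Return the epoch to resume from, or 1 if nothing was saved yet."""
    path = os.path.join(checkpoint_dir, "progress.json")
    if not os.path.isfile(path):
        return 1
    with open(path) as f:
        return json.load(f)["last_finished_epoch"] + 1


def train(checkpoint_dir, total_epochs, stop_after=None):
    """Run epochs start_epoch..total_epochs and return the epochs that ran."""
    start_epoch = load_start_epoch(checkpoint_dir)
    ran = []
    for epoch in range(start_epoch, total_epochs + 1):
        # The real training step and the model weight save would go here.
        ran.append(epoch)
        save_progress(checkpoint_dir, epoch)
        if stop_after is not None and epoch == stop_after:
            break  # simulate interrupting the training
    return ran


with tempfile.TemporaryDirectory() as tmp:
    first_run = train(tmp, total_epochs=3, stop_after=1)
    second_run = train(tmp, total_epochs=3)
    print(first_run, second_run)
```

With this pattern the epoch number printed in the log comes from the saved progress file, so it matches the checkpoint that was actually loaded.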

ShawnDong98 avatar Feb 24 '21 02:02 ShawnDong98