can you/how can you continue training?

Open aelbialy-tbox opened this issue 7 years ago • 5 comments

Hi guys,

Is it possible to continue training from where it left off? I know it is a silly question. I know that it saves checkpoints after each epoch for generating the audio. Is there a way to carry on training after it has been shut down?

Thank you! Starting to get alright results too

aelbialy-tbox avatar Jul 14 '17 15:07 aelbialy-tbox

If you didn't mess with the code, it should continue training from the latest saved checkpoint.

chief7 avatar Jul 14 '17 15:07 chief7

When I run python3 train.py and print the current epoch, it starts again from 1. Is there a way to check that it continues training?

aelbialy-tbox avatar Jul 14 '17 15:07 aelbialy-tbox

As far as I remember, epochs aren't saved with the model; the epoch count is just a range that the code iterates over. You may want to check TensorBoard and see whether your graphs start from scratch (they shouldn't).
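Another way to check, besides the graphs, is to look at the step number recorded in the log directory. The following is only a rough sketch and not part of this repo: it assumes the standard TensorFlow `checkpoint` state file is present and that checkpoints are saved with the global step appended (e.g. `model.ckpt-1234`, an illustrative name). The helper name `latest_checkpoint_step` and the `logdir` argument are made up for this example.

```python
import os
import re


def latest_checkpoint_step(logdir):
    """Return (checkpoint_name, step) from logdir/checkpoint, or None if absent.

    Hypothetical helper: it only parses the plain-text state file that
    TensorFlow writes next to the checkpoints; it does not load the model.
    """
    state_file = os.path.join(logdir, "checkpoint")
    if not os.path.isfile(state_file):
        return None
    with open(state_file) as f:
        for line in f:
            # "model_checkpoint_path" names the newest checkpoint; the
            # "all_model_checkpoint_paths" lines are skipped by this pattern.
            match = re.match(r'\s*model_checkpoint_path:\s*"(.+)"', line)
            if match is None:
                continue
            name = os.path.basename(match.group(1))
            # The step suffix is only present if a global step was passed
            # when saving (assumption).
            suffix = re.search(r"-(\d+)$", name)
            step = int(suffix.group(1)) if suffix else None
            return name, step
    return None
```

If the step reported here keeps growing across restarts, training is resuming from the checkpoint even though the printed epoch counter restarts at 1.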

chief7 avatar Jul 14 '17 16:07 chief7

@ahmed-tbox It will continue training from the latest saved checkpoint. See Supervisor in TensorFlow; here is the guide.

zuoxiang95 avatar Jul 15 '17 01:07 zuoxiang95

Is it possible to continue training from a model that was trained on a different machine? The old machine had 1 GPU and I used train.py; on the new machine I am using 2 GPUs with train_multi_gpus.py. Will that affect anything? Which files do I need to transfer over?

aelbialy-tbox avatar Jul 17 '17 17:07 aelbialy-tbox