improved-diffusion

How can I terminate the training?

Open ykcheng9966 opened this issue 1 year ago • 2 comments

The training loop seems to have no concept of epochs and simply runs batch by batch. How can I obtain the model with the best parameters? Could someone provide an answer? Thank you!

ykcheng9966, Jun 15 '23 08:06

Hi! By default the training runs indefinitely. However, every 10,000 steps a checkpoint file is saved to your /tmp directory unless you specify otherwise. You can change that interval either by editing the image_train.py script or by passing it as the --save_interval command-line argument.

nicolasfischoeder, Jun 16 '23 20:06
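
To make the step-based checkpointing described above concrete, here is a minimal sketch in plain Python. It is not the improved-diffusion training loop: `run_training`, `train_step`, `evaluate`, `save`, and `max_steps` are hypothetical names used only for illustration, and only the idea of a `save_interval` comes from the answer above. It shows one way a loop without epochs could stop after a fixed number of steps and remember which periodic checkpoint scored best on some validation metric that you supply yourself.

```python
import math


def run_training(train_step, evaluate, save, save_interval=10000, max_steps=None):
    """Step-based loop: checkpoint every `save_interval` steps.

    All callables are hypothetical placeholders supplied by the caller:
      train_step(step) -> runs one optimization step on one batch
      evaluate(step)   -> returns a validation loss (lower is better)
      save(step)       -> writes a checkpoint for this step
    With max_steps=None the loop never ends, mirroring "runs indefinitely".
    """
    best_loss, best_step = math.inf, None
    step = 0
    while max_steps is None or step < max_steps:
        train_step(step)
        step += 1
        if step % save_interval == 0:
            save(step)
            loss = evaluate(step)
            # Remember which periodic checkpoint scored best so far.
            if loss < best_loss:
                best_loss, best_step = loss, step
    return best_step, best_loss
```

With a setup like this, "the best model" is simply the saved checkpoint whose step number is returned, so no epoch concept is needed. Which metric `evaluate` should compute is a separate choice that this sketch does not settle.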

Thank you for your help. I have another question: how can I modify the hyperparameters so that the optimal model is saved?

ykcheng9966, Jun 18 '23 08:06