Palette-Image-to-Image-Diffusion-Models
resume training
Hello,
I am trying to resume training on the CelebA-HQ dataset from your checkpoint. I changed inpainting_celebahq.json as described in the instructions:
"path": { //set every part file path
"base_dir": "experiments", // base path for all log except resume_state
"code": "code", // code backup
"tb_logger": "tb_logger", // path of tensorboard logger
"results": "results",
"checkpoint": "checkpoint",
"resume_state": "experiments/train_inpainting_celebahq_221006_180531/checkpoint/200",
"resume_state": "200"
// "resume_state": null // ex: 100, loading .state and .pth from given epoch and iteration
},
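Note that the snippet above sets "resume_state" twice. As a minimal sketch, assuming the config ends up being parsed by Python's standard json module once the // comments are stripped (an assumption about the repository, not something confirmed in this thread), the last duplicate key silently wins. The strip_line_comments helper and the trimmed-down config string are hypothetical and only serve the illustration.

```python
import json

# Hypothetical, trimmed-down copy of the "path" section quoted above.
raw = '''
{
    "path": {
        "checkpoint": "checkpoint",
        "resume_state": "experiments/train_inpainting_celebahq_221006_180531/checkpoint/200",
        "resume_state": "200"
        // "resume_state": null
    }
}
'''


def strip_line_comments(text):
    """Drop // comments. Naive: assumes '//' never occurs inside a string value."""
    return "\n".join(line.split("//")[0] for line in text.splitlines())


config = json.loads(strip_line_comments(raw))

# json.loads keeps the LAST value when a key is repeated in one object.
resume_state = config["path"]["resume_state"]
print(resume_state)
```

If the parser behaves this way, only the second value ("200") takes effect and the full experiments/... path is ignored, so it is safer to keep a single "resume_state" entry.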
I put 200.state and 200_Network.pth in the "200" folder, but after running the command
python run.py -p train -c config/inpainting_celebahq.json
training doesn't start. I don't even get an error; the only output is "Close the Tensorboard SummaryWriter."
What is the correct way to resume training?
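One thing worth checking is whether the checkpoint files sit where the loader looks for them. The sketch below assumes, based on the config comment ("loading .state and .pth from given epoch and iteration") and the file names mentioned above, that "resume_state" is used as a path prefix, so the loader would look for <resume_state>.state and <resume_state>_Network.pth. That naming scheme is an assumption, not something confirmed in this thread, and expected_checkpoint_files and missing_checkpoint_files are hypothetical helpers.

```python
from pathlib import Path


def expected_checkpoint_files(resume_state):
    """Files a prefix-style loader would open.

    ASSUMPTION: resume_state is a path prefix, giving
    '<prefix>.state' and '<prefix>_Network.pth'.
    """
    return [Path(f"{resume_state}.state"), Path(f"{resume_state}_Network.pth")]


def missing_checkpoint_files(resume_state):
    """Return the expected files that do not exist on disk."""
    return [p for p in expected_checkpoint_files(resume_state) if not p.is_file()]


if __name__ == "__main__":
    prefix = "experiments/train_inpainting_celebahq_221006_180531/checkpoint/200"
    for path in missing_checkpoint_files(prefix):
        print(f"missing: {path}")
```

Under that assumption, files placed inside a "200" folder (for example checkpoint/200/200.state) would be reported as missing; they would need to sit directly in the checkpoint directory so that the prefix checkpoint/200 resolves to them.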
Sorry about this error. You can comment out the try in https://github.com/Janspiry/Palette-Image-to-Image-Diffusion-Models/blob/136b29f58d0af6e5db9f3655d2891f5a855fcdaa/run.py#L56 to get the exact traceback.
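As an alternative to commenting out the try, the exception can be printed explicitly before the cleanup runs. This is a generic sketch, not the repository's run.py: run_phase, train_fn and close_writer are hypothetical stand-ins for whatever the guarded block and its cleanup actually do.

```python
import traceback


def run_phase(train_fn, close_writer):
    """Run train_fn, print the full traceback on failure, and always clean up."""
    try:
        train_fn()
    except Exception:
        # Show the real error before any cleanup message appears.
        traceback.print_exc()
        raise
    finally:
        close_writer()


if __name__ == "__main__":
    def failing_train():
        # Simulated failure, standing in for a checkpoint that cannot be loaded.
        raise FileNotFoundError("200_Network.pth")

    try:
        run_phase(failing_train, lambda: print("Close the Tensorboard SummaryWriter."))
    except FileNotFoundError:
        pass
```

With this pattern the traceback is written to stderr first, the writer is still closed, and the exception is re-raised so the process does not exit silently.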