
resume training

Open hodfa840 opened this issue 2 years ago • 1 comments

Hello,

I am trying to resume training on the CelebA-HQ dataset from your checkpoint. I changed inpainting_celebahq.json as described in the instructions:

  "path": { //set every part file path
        "base_dir": "experiments", // base path for all log except resume_state
        "code": "code", // code backup
        "tb_logger": "tb_logger", // path of tensorboard logger
        "results": "results",
        "checkpoint": "checkpoint",
        "resume_state": "experiments/train_inpainting_celebahq_221006_180531/checkpoint/200",
        "resume_state": "200"
        // "resume_state": null // ex: 100, loading .state  and .pth from given epoch and iteration
    },

I put 200.state and 200_Network.pth in the "200" folder, but after running the command:

python run.py -p train -c config/inpainting_celebahq.json

The training doesn't start. I don't even get an error; the only output is "Close the Tensorboard SummaryWriter." What is the correct way to resume training?

hodfa840 avatar Oct 07 '22 17:10 hodfa840

Sorry for this error. You can comment out the try at https://github.com/Janspiry/Palette-Image-to-Image-Diffusion-Models/blob/136b29f58d0af6e5db9f3655d2891f5a855fcdaa/run.py#L56 to get the exact traceback info.
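
As a rough illustration of why removing the try helps, here is a minimal sketch. It is not the actual run.py code: run_training, run_quiet and run_verbose are hypothetical names, and the FileNotFoundError is only a stand-in for whatever really fails. It assumes a wrapper that runs cleanup (the "Close the Tensorboard SummaryWriter." message) while the underlying exception never reaches the console, and shows how printing the traceback exposes it.

```python
import traceback


def run_training(resume_state):
    # Hypothetical stand-in for the real training entry point.
    # The error below is an illustrative failure, not the confirmed cause.
    raise FileNotFoundError(f"no checkpoint found at {resume_state}.state")


def run_quiet(resume_state):
    """Cleanup runs, but the exception is swallowed, so nothing is reported."""
    try:
        run_training(resume_state)
    except Exception:
        pass
    finally:
        print("Close the Tensorboard SummaryWriter.")


def run_verbose(resume_state):
    """Same cleanup, but the full traceback is printed and returned."""
    try:
        run_training(resume_state)
    except Exception:
        tb = traceback.format_exc()
        print(tb)
        return tb
    finally:
        print("Close the Tensorboard SummaryWriter.")
    return None


if __name__ == "__main__":
    run_quiet("checkpoint/200")    # only the cleanup message appears
    run_verbose("checkpoint/200")  # traceback appears before the cleanup message
```

Commenting out the try (or adding a traceback.print_exc() call in an except branch) has the same effect as run_verbose here: the real error becomes visible instead of only the cleanup message.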

Janspiry avatar Oct 08 '22 14:10 Janspiry