SwapNet

Training Warp stage stops at epoch 3

Open phenomenal-manish opened this issue 5 years ago • 3 comments

Hi,

I ran train.py for the warp stage twice (python train.py --name deep_fashion/warp --model warp --dataroot data/deep_fashion). However, training does not proceed beyond epoch 3. Could you help me with this issue? I have attached screenshots for reference.

[Screenshots attached: IMG-20200617-WA0001, Capture]

phenomenal-manish avatar Jun 17 '20 07:06 phenomenal-manish

Hi! Sorry I'm not sure what the issue is. I haven't encountered this before.

andrewjong avatar Jun 18 '20 18:06 andrewjong

One thing I noticed is that when the loss values are exactly the same, the execution freezes. Have you used callbacks or anything else that stops training? I went through the code but could not find anything like that.

phenomenal-manish avatar Jun 18 '20 18:06 phenomenal-manish

> One thing I noticed is that when the loss values are exactly the same, the execution freezes. Have you used callbacks or anything else that stops training? I went through the code but could not find anything like that.

The identical loss values may just come from the print format %.3f. Please try visualizer.py, line 242: `message += '%s: %.3f '` — if you increase the precision to %.6f, it may show the difference between iterations.
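To illustrate the precision point (a standalone sketch, not the actual visualizer.py code — the variable names here are made up), two losses that differ only past the third decimal place render identically at %.3f but differ at %.6f:

```python
# Standalone sketch: why losses can look "exactly the same" at %.3f.
loss_prev = 0.1234567  # hypothetical loss at iteration N
loss_curr = 0.1234999  # hypothetical loss at iteration N+1

# At three decimal places both format to the same string.
print("%s: %.3f" % ("G_loss", loss_prev))  # G_loss: 0.123
print("%s: %.3f" % ("G_loss", loss_curr))  # G_loss: 0.123

# At six decimal places the difference between iterations is visible.
print("%s: %.6f" % ("G_loss", loss_prev))  # G_loss: 0.123457
print("%s: %.6f" % ("G_loss", loss_curr))  # G_loss: 0.123500
```

So identical printed losses do not necessarily mean training is stuck; they may just be rounded to the same value.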

For the freezing issue, maybe you can debug and check the values of opt.start_epoch + 1 and opt.n_epochs + 1.
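The suggestion above makes sense if the training loop uses the common pattern `for epoch in range(opt.start_epoch + 1, opt.n_epochs + 1)` — a sketch under that assumption (the helper name is hypothetical): a restored start_epoch or a small n_epochs can make the range end at epoch 3, or even be empty.

```python
# Hypothetical sketch of the loop bounds being suggested for debugging.
# If opt.start_epoch is restored from a checkpoint, or opt.n_epochs is
# small, the loop can stop earlier than expected.
def epochs_to_run(start_epoch, n_epochs):
    """Return the epoch indices a range(start_epoch + 1, n_epochs + 1) loop executes."""
    return list(range(start_epoch + 1, n_epochs + 1))

print(epochs_to_run(0, 3))  # [1, 2, 3] -> training stops after epoch 3
print(epochs_to_run(3, 3))  # []        -> nothing runs at all
```

Printing those two option values just before the loop would show whether the run is ending by design rather than actually freezing.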

tuan-seoultech avatar Jul 01 '20 04:07 tuan-seoultech