
Training did not converge

Open Lmy0217 opened this issue 7 years ago • 2 comments

I got the following losses during training:

...
2018-10-27 Saturday 02:21:32 : Train Epoch: 1 [213/284 (75%)]   loss_G: 13.778452       loss_G_identity: 4.409110       loss_G_GAN: 0.245768    loss_G_cycle: 9.123573  loss_D: 1.171054
2018-10-27 Saturday 02:21:33 : Train Epoch: 1 [214/284 (75%)]   loss_G: 13.634476       loss_G_identity: 3.904183       loss_G_GAN: 0.687696    loss_G_cycle: 9.042597  loss_D: 0.879381
2018-10-27 Saturday 02:21:33 : Train Epoch: 1 [215/284 (76%)]   loss_G: 22.416130       loss_G_identity: 5.798176       loss_G_GAN: 0.030727    loss_G_cycle: 16.587227 loss_D: 1.798575
2018-10-27 Saturday 02:21:33 : Train Epoch: 1 [216/284 (76%)]   loss_G: 29.573841       loss_G_identity: 6.921859       loss_G_GAN: 0.317474    loss_G_cycle: 22.334511 loss_D: 3.714886
2018-10-27 Saturday 02:21:33 : Train Epoch: 1 [217/284 (76%)]   loss_G: 36.265060       loss_G_identity: 8.680099       loss_G_GAN: 0.797719    loss_G_cycle: 26.787243 loss_D: 3.788987
2018-10-27 Saturday 02:21:33 : Train Epoch: 1 [218/284 (77%)]   loss_G: 35.755966       loss_G_identity: 8.223778       loss_G_GAN: 1.697493    loss_G_cycle: 25.834694 loss_D: 10.536936
2018-10-27 Saturday 02:21:34 : Train Epoch: 1 [219/284 (77%)]   loss_G: 39.954712       loss_G_identity: 8.718156       loss_G_GAN: 3.962042    loss_G_cycle: 27.274513 loss_D: 10.759437
2018-10-27 Saturday 02:21:34 : Train Epoch: 1 [220/284 (77%)]   loss_G: 47.431328       loss_G_identity: 9.417677       loss_G_GAN: 9.492589    loss_G_cycle: 28.521061 loss_D: 16.836094
2018-10-27 Saturday 02:21:34 : Train Epoch: 1 [221/284 (78%)]   loss_G: 68.834702       loss_G_identity: 11.180630      loss_G_GAN: 24.363785   loss_G_cycle: 33.290287 loss_D: 56.229206
2018-10-27 Saturday 02:21:34 : Train Epoch: 1 [222/284 (78%)]   loss_G: 76.460297       loss_G_identity: 8.798868       loss_G_GAN: 39.637287   loss_G_cycle: 28.024141 loss_D: 56.287437
2018-10-27 Saturday 02:21:35 : Train Epoch: 1 [223/284 (79%)]   loss_G: 90.395309       loss_G_identity: 9.412882       loss_G_GAN: 52.389076   loss_G_cycle: 28.593346 loss_D: 93.407372
2018-10-27 Saturday 02:21:35 : Train Epoch: 1 [224/284 (79%)]   loss_G: 124.559311      loss_G_identity: 8.107519       loss_G_GAN: 92.771484   loss_G_cycle: 23.680317 loss_D: 122.070450
2018-10-27 Saturday 02:21:35 : Train Epoch: 1 [225/284 (79%)]   loss_G: 197.886826      loss_G_identity: 10.789768      loss_G_GAN: 154.939728  loss_G_cycle: 32.157345 loss_D: 258.935150
2018-10-27 Saturday 02:21:35 : Train Epoch: 1 [226/284 (80%)]   loss_G: 203.583557      loss_G_identity: 8.484806       loss_G_GAN: 169.029099  loss_G_cycle: 26.069662 loss_D: 365.472229
2018-10-27 Saturday 02:21:36 : Train Epoch: 1 [227/284 (80%)]   loss_G: 277.741608      loss_G_identity: 9.769445       loss_G_GAN: 238.781219  loss_G_cycle: 29.190956 loss_D: 392.631653
2018-10-27 Saturday 02:21:36 : Train Epoch: 1 [228/284 (80%)]   loss_G: 254.521164      loss_G_identity: 11.291607      loss_G_GAN: 215.258072  loss_G_cycle: 27.971478 loss_D: 557.096924
2018-10-27 Saturday 02:21:36 : Train Epoch: 1 [229/284 (81%)]   loss_G: 359.739563      loss_G_identity: 10.633758      loss_G_GAN: 319.721008  loss_G_cycle: 29.384762 loss_D: 647.901001
2018-10-27 Saturday 02:21:36 : Train Epoch: 1 [230/284 (81%)]   loss_G: 419.248474      loss_G_identity: 8.859273       loss_G_GAN: 385.754211  loss_G_cycle: 24.635035 loss_D: 789.995117
2018-10-27 Saturday 02:21:37 : Train Epoch: 1 [231/284 (81%)]   loss_G: 470.745880      loss_G_identity: 10.386059      loss_G_GAN: 431.400543  loss_G_cycle: 28.959274 loss_D: 903.578308
...

Is something wrong? What should I do?

Lmy0217, Oct 28 '18 11:10

Hey bro, I have the same issue. The first 2 epochs were fine, but in the third epoch the loss grew into the millions.

I tried removing the lr_scheduler and keeping the learning rate fixed at about 0.0002, and then it converged just fine (on my own dataset).
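For anyone wanting to try this, here is a minimal sketch of what that change might look like in plain PyTorch. It is not the repository's actual training script: `netG` and `netD` are hypothetical stand-ins for the real generator and discriminator, the batch is random dummy data, and the loss is a simplified least-squares GAN loss chosen only to make the loop runnable. The relevant part is that no `torch.optim.lr_scheduler` object is created and no `scheduler.step()` is called, so the learning rate stays at 0.0002.

```python
import torch
from torch import nn

# Hypothetical stand-ins for the CycleGAN generator and discriminator.
netG = nn.Conv2d(3, 3, kernel_size=3, padding=1)
netD = nn.Conv2d(3, 1, kernel_size=3, padding=1)

LR = 0.0002  # the fixed learning rate mentioned in this thread

optimizer_G = torch.optim.Adam(netG.parameters(), lr=LR)
optimizer_D = torch.optim.Adam(netD.parameters(), lr=LR)

# Note: no lr_scheduler is constructed here, and nothing below calls
# scheduler.step(), so the optimizers keep using LR for every epoch.

criterion = nn.MSELoss()  # simplified loss, for illustration only

for epoch in range(3):
    real = torch.randn(2, 3, 8, 8)  # dummy batch instead of real images

    # Generator update: try to make the discriminator output "real" (ones).
    optimizer_G.zero_grad()
    pred_fake = netD(netG(real))
    loss_G = criterion(pred_fake, torch.ones_like(pred_fake))
    loss_G.backward()
    optimizer_G.step()

    # Discriminator update: real -> ones, generated (detached) -> zeros.
    optimizer_D.zero_grad()
    pred_real = netD(real)
    pred_fake = netD(netG(real).detach())
    loss_D = 0.5 * (
        criterion(pred_real, torch.ones_like(pred_real))
        + criterion(pred_fake, torch.zeros_like(pred_fake))
    )
    loss_D.backward()
    optimizer_D.step()

    # The learning rate is unchanged from epoch to epoch.
    print(epoch, optimizer_G.param_groups[0]["lr"], optimizer_D.param_groups[0]["lr"])
```

In a real training script the same idea would mean dropping the scheduler construction and its per-epoch `step()` calls while leaving the optimizers as they are.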

You can give it a try :)

xyy-Iv, Nov 13 '18 03:11

Thanks bro, it worked. I removed the lr_scheduler and kept the learning rate at about 0.0002, and then it converged just fine (on Monet2Photo).

1749352011, Aug 06 '23 05:08