
PatchGAN loss calculated incorrectly (perhaps)

Open alexander-soare opened this issue 4 years ago • 3 comments

Thanks for the great work @aitorzip. I think I've spotted a mistake in how the GAN loss is calculated.

In the forward of your Discriminator class you average the activations first, and then calculate the loss from that single averaged value. I think you should instead calculate the per-neuron loss first and then average the result. If you do the math both ways you'll find the losses are not the same.
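A minimal sketch of the difference, assuming the discriminator emits a patch map of scores (the tensor names here are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

# Hypothetical discriminator output: a 4x4 patch map of scores for one image.
patch_scores = torch.randn(1, 1, 4, 4)
real_label = torch.ones_like(patch_scores)

# Current implementation (roughly): average the patch activations first,
# then compute MSE against a single target value.
loss_avg_first = F.mse_loss(patch_scores.mean(), torch.tensor(1.0))

# Proposed fix: compute the per-patch MSE first, then average.
loss_mse_first = F.mse_loss(patch_scores, real_label)

print(loss_avg_first.item(), loss_mse_first.item())  # generally not equal
```

By Jensen's inequality the squared error of the mean is at most the mean of the squared errors, so the two only coincide when all patch scores are identical.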

I noticed you haven't committed since 2017 but if you see this and agree I'd be happy to make a PR.

alexander-soare avatar May 21 '20 17:05 alexander-soare

@alexander-soare I was wondering whether it makes much of a difference in the results. Have you tried comparing the two?

naga-karthik avatar Oct 22 '20 22:10 naga-karthik

@naga-karthik it's been a while, but I think I remember determining that the impact was equivalent to a scalar multiple of the gradient, so it only affects the effective learning rate. Don't take my word for it though, as it's been a while.

alexander-soare avatar Oct 23 '20 09:10 alexander-soare

> @naga-karthik it's been a while, but I think I remember determining that the impact was equivalent to a scalar multiple of the gradient, so it only affects the effective learning rate. Don't take my word for it though, as it's been a while.

Late to the party here, but you are correct. This really needs to be changed. The current implementation misses the point of the PatchGAN paper entirely. If you average the patches before computing the MSE, you might as well add a dense layer and make a global discriminator, because you've obscured all the local information.

Unfortunately, this is not just a matter of scaling the learning rate. It definitely changes gradient propagation: averaging first pushes the same pooled error to every patch, while taking the per-patch MSE first gives each patch its own error signal.
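A small autograd check illustrates this, using a toy 2x2 patch map standing in for a discriminator output (all names here are hypothetical):

```python
import torch
import torch.nn.functional as F

# Toy patch map with gradients enabled, standing in for D's output.
scores = torch.tensor([[0.2, 0.9], [0.5, 0.1]], requires_grad=True)
target = torch.ones_like(scores)

# Average first, then squared error: every patch receives the same gradient,
# proportional to the single pooled error, 2 * (mean(s) - 1) / N.
loss_a = (scores.mean() - 1.0) ** 2
grad_a, = torch.autograd.grad(loss_a, scores)

# Per-patch MSE first, then average: each patch gets its own gradient,
# 2 * (s_i - 1) / N, which depends on that patch's own error.
scores2 = scores.detach().clone().requires_grad_(True)
loss_b = F.mse_loss(scores2, target)
grad_b, = torch.autograd.grad(loss_b, scores2)

print(grad_a)  # constant across all patches
print(grad_b)  # varies per patch, so not a scalar multiple of grad_a
```

The first gradient is uniform across patches, so no rescaling of the learning rate can make it match the second one, which carries per-patch information.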

jbeck9 avatar Jan 24 '22 04:01 jbeck9