
G loss increases, what does this mean?

wagamamaz opened this issue 8 years ago • 42 comments

Hi, I am training a conditional GAN. At the beginning, both the G and D losses decrease, but around epoch 200 the G loss starts to increase from 1 to 3, and the image quality seems to stop improving.

Any ideas? Thank you in advance.

wagamamaz avatar Jan 22 '17 21:01 wagamamaz

It's hard to say!

zhangqianhui avatar Jan 27 '17 03:01 zhangqianhui

OK, this is for an unconditional boilerplate GAN. What I found for the loss increase in G was that: a) it was accompanied by a decrease in D loss; essentially G starts diverging. b) image quality improved subtly, but it did improve.

LukasMosser avatar Jan 29 '17 19:01 LukasMosser

I think the discriminator got too strong relative to the generator. Beyond this point, the generator finds it almost impossible to fool the discriminator, hence the increase in its loss. I'm facing a similar problem.

vijayvee avatar Feb 13 '17 01:02 vijayvee

Have you tried label smoothing @vijayvee ?

LukasMosser avatar Feb 13 '17 20:02 LukasMosser
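For anyone wondering what that looks like in code, below is a minimal sketch of one-sided label smoothing using TensorFlow's sigmoid cross-entropy; the logits arguments stand in for whatever your discriminator outputs.

```python
import tensorflow as tf

def discriminator_loss(d_logits_real, d_logits_fake, smooth=0.9):
    """Standard GAN discriminator loss with one-sided label smoothing.

    d_logits_real / d_logits_fake are the raw (pre-sigmoid) discriminator
    outputs on real and generated batches.
    """
    # Train D against 0.9 instead of 1.0 for real samples, keep fakes at 0.0,
    # so the discriminator does not become over-confident.
    loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=smooth * tf.ones_like(d_logits_real), logits=d_logits_real))
    loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.zeros_like(d_logits_fake), logits=d_logits_fake))
    return loss_real + loss_fake
```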

No I haven't tried it yet @LukasMosser

vijayvee avatar Feb 14 '17 02:02 vijayvee

I am facing a similar problem while training InfoGAN on the SVHN dataset. Any suggestions on how to overcome this? (attached plot: infogan_loss)

dugarsumit avatar Jun 09 '17 07:06 dugarsumit

I am also facing a similar problem with InfoGAN on a different dataset. Any suggestions?

cianeastwood avatar Jun 10 '17 18:06 cianeastwood

In my experience, when the D loss decreases to a small value (0.1 to 0.2) and the G loss increases to a high value (2 to 3), it means training has finished, as the generator cannot be improved further.

But if the D loss decreases to a small value within just a few epochs, it means training has failed, and you may need to check the network architecture.

zsdonghao avatar Jun 15 '17 12:06 zsdonghao

I have the same problem. When I train a GAN, I expect that by the end of training (at some infinite moment) G will always fool D. But in fact I am faced with the following problem: at the beginning of the process, G learns correctly; it learns to produce good images with the necessary conditions. But after some point G starts to diverge. In the end, G produces only random noise. Why does this happen?

ezamyatin avatar Jun 29 '17 14:06 ezamyatin

Probably the problem is that the discriminator overfits. One of the reasons leading to this is the following: the discriminator may "notice" that images from the true distribution are matrices of numbers of the form n/255. So adding Gaussian noise to the input images may help avoid the problem. It helped in my case.

ezamyatin avatar Jul 02 '17 18:07 ezamyatin
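A minimal sketch of that idea in TensorFlow 1.x style (the stddev of 0.1 is just an illustrative value, and `discriminator`, `real_images`, `fake_images` in the comments are placeholders for your own model and data):

```python
import tensorflow as tf

def with_input_noise(images, stddev=0.1):
    """Add zero-mean Gaussian noise to images before they reach the discriminator."""
    return images + tf.random_normal(shape=tf.shape(images), mean=0.0, stddev=stddev)

# Feed the noisy tensors to D for both real and generated batches, e.g.
#   d_logits_real = discriminator(with_input_noise(real_images))
#   d_logits_fake = discriminator(with_input_noise(fake_images))
# so D cannot latch onto the exact n/255 pixel values of the real data.
```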

Label switching has also helped for me.

Two updates of the discriminator with real_label = 1, fake_label = 0, and one update with real_label = 0 and fake_label = 1.

This is followed by one generator update with real_label = 1 and fake_label = 0.

LukasMosser avatar Sep 20 '17 08:09 LukasMosser
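As a rough sketch, the schedule above could be wrapped like this; `d_train_step` and `g_train_step` are hypothetical helpers standing in for whatever your training loop uses to update D and G against given target labels.

```python
def train_one_round(d_train_step, g_train_step):
    """One round of the label-switching schedule described above.

    d_train_step(real_label, fake_label) and g_train_step(real_label, fake_label)
    are hypothetical helpers that build the usual cross-entropy losses against
    the given targets and apply one optimizer step to D or G respectively.
    """
    d_train_step(real_label=1.0, fake_label=0.0)  # normal D update
    d_train_step(real_label=1.0, fake_label=0.0)  # second normal D update
    d_train_step(real_label=0.0, fake_label=1.0)  # D update with switched labels
    g_train_step(real_label=1.0, fake_label=0.0)  # G update with normal labels
```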

Label smoothing helped for me.

shijx12 avatar Oct 09 '17 07:10 shijx12

Adding Gaussian noise helped for me.

Howie-hxu avatar Dec 26 '17 05:12 Howie-hxu

@Howie-hxu and @EvgenyZamyatin: I saw that adding Gaussian noise in the discriminator helped in your case. I have a few questions:

  1. What did you use as the mean and variance of the Gaussian noise?
  2. Did you apply Gaussian noise in each layer of the discriminator? Let's say we are using the DCGAN architecture.
  3. Do you apply the noise layer after the activation or before the convolution?
  4. Suppose I am using TensorFlow; how would you implement that?

Keenly waiting for your help! Thanks, Avisek

avisekiit avatar Apr 22 '18 06:04 avisekiit

Same doubt here

SHANKARMB avatar Apr 23 '18 08:04 SHANKARMB

Same doubts as yours. @avisekiit

17Skye17 avatar May 07 '18 12:05 17Skye17

I have used the idea of instance noise described here. My experiment was to add the Gaussian noise only to the input tensor of the discriminator. It was zero-mean and its standard deviation ranged from 0.1 to 0 (i.e. decaying with each mini-batch iteration). This improved the results considerably on the MNIST dataset.

ahmed-fau avatar May 09 '18 06:05 ahmed-fau
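In code, that decaying instance noise might look roughly like this (TensorFlow 1.x style to match the other snippets in this thread; the linear decay and `total_steps` are assumptions about the schedule, not anything prescribed):

```python
import tensorflow as tf

def instance_noise(images, global_step, total_steps, initial_stddev=0.1):
    """Zero-mean Gaussian noise whose stddev decays linearly from 0.1 to 0."""
    progress = tf.cast(global_step, tf.float32) / float(total_steps)
    stddev = initial_stddev * tf.maximum(0.0, 1.0 - progress)
    return images + tf.random_normal(shape=tf.shape(images), mean=0.0, stddev=stddev)

# Applied to the discriminator input for both real and generated batches, e.g.
#   d_logits_real = discriminator(instance_noise(real_images, step, total_steps))
#   d_logits_fake = discriminator(instance_noise(fake_images, step, total_steps))
```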

Thank you! I'll try it @ahmed-fau

17Skye17 avatar May 14 '18 10:05 17Skye17

(attached plot: loss)

Hello. I am training CycleGAN and my losses look like the attached picture. The discriminator loss decreases but the generator loss fluctuates. I do not quite understand the reasons. Does anyone have any suggestions? Thanks

phamnam95 avatar Jun 13 '18 14:06 phamnam95

Adding noise to the input seems to help. To be specific, I am implementing it in TensorFlow by adding: input = input + tf.random_normal(shape=tf.shape(input), mean=0.0, stddev=0.1, dtype=tf.float32)

robot010 avatar Jun 13 '18 19:06 robot010

I agree that adding noise to the discriminator's input does help your generator loss decrease. @ahmed-fau suggested very good tips.

bjgoncalves avatar Aug 05 '18 23:08 bjgoncalves

Hi, I tried what you guys did, adding Gaussian noise to the input of the discriminator. It does improve the loss curves, but the test images generated by the generator come out as noise as well. (Previously I had relatively OK images, but my generator loss was going up.)

Thoughts?

lppier avatar Aug 17 '18 09:08 lppier

Hi, I tried what you guys did, adding Gaussian noise to the input of the discriminator. It does improve the loss curves, but the test images generated by the generator come out as noise as well. (Previously I had relatively OK images, but my generator loss was going up.)

Thoughts?

Did you also have decay of the noise after a while?

davesean avatar Oct 24 '18 13:10 davesean

@EvgenyZamyatin adding noise to the input helped, thanks a lot

hi0001234d avatar Dec 19 '18 11:12 hi0001234d

I am facing a similar problem while using WGAN-GP. The generator initially produces good results but seems to diverge after some time; the discriminator loss suddenly dips and the discriminator becomes very powerful, making the generator output random noise. What can be done instead of label smoothing, since I am using WGAN?

aradhyamathur avatar Dec 21 '18 10:12 aradhyamathur

@aradhyamathur you could try adding a penalty loss term on the discriminator output magnitude, similar to https://github.com/tkarras/progressive_growing_of_gans

This helps to prevent a training dynamic where the models engage in a "magnitudes race" and eventually lose any meaningful learning signals.

ljuvela avatar Dec 21 '18 11:12 ljuvela
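For reference, a small sketch of the kind of penalty meant here, along the lines of the epsilon-drift term in progressive_growing_of_gans; the names in the comments (`d_loss_wgan_gp`, `d_logits_real`) are placeholders for your own tensors.

```python
import tensorflow as tf

def drift_penalty(d_logits_real, weight=1e-3):
    """Penalty on the magnitude of the critic output for real samples.

    Discourages the "magnitudes race" mentioned above; the 1e-3 weight follows
    progressive_growing_of_gans but is just a tunable hyperparameter.
    """
    return weight * tf.reduce_mean(tf.square(d_logits_real))

# Added on top of the usual WGAN-GP discriminator loss, e.g.
#   d_loss = d_loss_wgan_gp + drift_penalty(d_logits_real)
```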

@phamnam95 That looks like a typical CycleGAN loss. What is your batch size? If it is 1 or 2, there will be lots of fluctuation in your objective function. I've seen it before; it looks pretty normal to me.

LukasMosser avatar Dec 22 '18 11:12 LukasMosser

@LukasMosser My batch size is 1. After adding some more constraints, such as an identity loss and a self-distance loss, and also semi-supervising the CycleGAN with paired images, the generator loss decreases, though very slowly; after 200 epochs the trend is still downward. The discriminator loss decreases until a certain number of epochs and then starts to fluctuate. What do you think would be good? What batch size do you think is appropriate?

phamnam95 avatar Dec 22 '18 15:12 phamnam95

Hi, I tried what you guys did, adding Gaussian noise to the input of the discriminator. It does improve the loss curves, but the test images generated by the generator come out as noise as well. (Previously I had relatively OK images, but my generator loss was going up.)

Thoughts?

Hi, I have the same problem. Did you manage to solve it? Many thanks.

libo1712 avatar Dec 24 '18 12:12 libo1712

@phamnam95 I think batch size = 1 is OK. I'm not really worried about the fluctuation; it just means you'll have to pick a model with an appropriate generator loss and not one where it seemingly diverged.

LukasMosser avatar Dec 27 '18 14:12 LukasMosser