
Why is Discriminator Loss 0 a failure mode?

Open arijitx opened this issue 6 years ago • 15 comments

In tip 10 you say that discriminator loss 0 is a failure mode, but in the paper they say this (see attached image):

What am I getting wrong here?

Thanks,

arijitx avatar Mar 25 '18 07:03 arijitx

I think the whole point of GANs is to have losses that counterbalance one another. Unlike traditional CNNs, we are not dealing with a single loss that we want to drive as low as possible. The error you show from the paper is indeed D's loss, but you must also consider G's loss, which is the opposite of D's (this is not exactly true and is implementation-dependent, but the intuition is that loss D = - loss G). Therefore, in a GAN you don't want D's loss to go to zero, because that would mean D is doing too good a job (and, most importantly, G too bad a one): D can easily discriminate between fake and real data, i.e. G's creations are not close enough to real data.

To sum it up, it is important to define D's loss that way because we do want D to try to reduce it, but the ultimate goal of the whole G-D system is to have the losses balance out. Hence if one loss goes to zero, it's a failure mode (no more learning happens).
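
For reference, the minimax value function from the original GAN paper (Goodfellow et al., 2014) makes this precise:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr] + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]$$

D is trained to maximize V while G is trained to minimize it, which is where the "loss of D = minus loss of G" intuition comes from; in practice G is often trained to maximize log D(G(z)) instead, so the symmetry is only approximate.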

ghost avatar Apr 05 '18 15:04 ghost

Hence if one loss goes to zero, it's a failure mode (no more learning happens).

I wouldn't say that no more learning happens. For instance, suppose the discriminator's loss goes to 0 at the beginning. The generator then improves, and in the next iteration the synthetic observations are good enough to fool the discriminator, so its loss increases.

Generally, I would focus on the training process being stable. My understanding is that at the very beginning the discriminator's accuracy should be high (say 90%), meaning that it separates fake observations from real ones well. Then its loss should steadily increase as the generator improves.

The perfect (final) state is when you:

  • have 100% accuracy for the generator, meaning the discriminator classifies all synthetic observations as real;
  • have about 50% accuracy for the discriminator, meaning it cannot distinguish fake observations from real ones;
  • have synthetic observations of good quality.

The last point however is another story.
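
For illustration, a minimal PyTorch-style sketch of how one might monitor that discriminator accuracy during training (the names `D`, `real_batch`, and `fake_batch` are placeholders, not anything from this thread):

```python
import torch

@torch.no_grad()
def discriminator_accuracy(D, real_batch, fake_batch):
    """Fraction of real samples scored as real and fake samples scored as fake.

    Assumes D returns raw logits (no final sigmoid). Accuracy near 100%
    suggests D is winning; accuracy hovering around 50% suggests balance.
    """
    real_as_real = torch.sigmoid(D(real_batch)) > 0.5
    fake_as_real = torch.sigmoid(D(fake_batch)) > 0.5
    correct = real_as_real.sum() + (~fake_as_real).sum()
    total = real_as_real.numel() + fake_as_real.numel()
    return (correct.float() / total).item()
```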

mateuszkaleta avatar Jul 25 '18 07:07 mateuszkaleta

@mateuszkaleta AFAICT, if the discriminator loss goes to zero, there are no more loss gradients flowing (since these gradients are derivatives of the loss), so the weights of D and G are not modified, and G cannot "get improved in the next iteration" as you propose.
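
A tiny numerical illustration of that point for the original (minimax) generator loss log(1 - D(G(z))): once the discriminator's sigmoid output saturates, essentially no gradient reaches the generator (the logit value below is just an illustrative extreme):

```python
import torch

# Pretend D is extremely confident that a generated sample is fake:
# a large negative logit, so sigmoid(logit) is ~0 and D's loss on it is ~0.
logit = torch.tensor([-30.0], requires_grad=True)

# The minimax generator loss on that sample.
g_loss = torch.log(1 - torch.sigmoid(logit))
g_loss.backward()

print(logit.grad)  # ~ -1e-13: practically no learning signal for G
```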

ghost avatar Jul 25 '18 08:07 ghost

What should I do to prevent this failure mode? Does anyone have any suggestions? Thanks!

sunbau avatar Dec 23 '18 20:12 sunbau

When I trained a DCGAN on a celebrity face dataset, my discriminator's loss quickly converged to zero and no more learning happened. But I was able to solve this problem in my case.

The error was that I was using a sigmoid layer at the discriminator output and applying binary cross-entropy (BCE) loss to that output. When I removed the sigmoid layer and instead computed BCE directly on the logits, it worked like a charm.

This is a well-known numerical-instability problem when dealing with exponentials and logarithms. Essentially, very large positive logits were rounded to a probability of 1 and very large negative ones to 0. This doesn't happen when BCE is computed directly on the logits, because that implementation uses the log-sum-exp trick.

It's also my understanding that the loss can never truly reach zero, since the logits can't be -inf or +inf, so there must be some rounding going on whenever you see exactly zero loss.
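
A quick way to see the effect (a hedged sketch; the logit value is just an illustrative extreme):

```python
import torch
import torch.nn.functional as F

# A confidently wrong prediction: large positive logit, but the true label is 0.
logit = torch.tensor([30.0])
target = torch.zeros(1)

# Sigmoid followed by BCE: in float32, sigmoid(30) rounds to exactly 1.0,
# so -log(1 - p) would be infinite; PyTorch clamps it instead, and the
# true magnitude of the error is lost once the sigmoid saturates.
loss_saturated = F.binary_cross_entropy(torch.sigmoid(logit), target)

# BCE computed directly on the logit uses the log-sum-exp trick internally,
# so the loss comes out as roughly 30, as it should.
loss_stable = F.binary_cross_entropy_with_logits(logit, target)

print(loss_saturated.item(), loss_stable.item())
```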

KrnTneja avatar May 16 '19 08:05 KrnTneja

@KrnTneja: Thanks for the trick. Could you provide some code for it? I'm also running into the problem of D's loss going to zero.

John1231983 avatar May 16 '19 15:05 John1231983

@KrnTneja: Thanks for the trick. Could you provide some code for it? I'm also running into the problem of D's loss going to zero.

There isn't really any code to show. Just ensure that the last layer of your discriminator is not a sigmoid layer, i.e. its output shouldn't be constrained to [0, 1]. I was using PyTorch, where I had to use torch.nn.BCEWithLogitsLoss instead of torch.nn.BCELoss.
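
In case it helps to see the change spelled out anyway, a minimal hedged sketch of what this looks like in PyTorch (the layer sizes are arbitrary placeholders):

```python
import torch.nn as nn

# Discriminator head WITHOUT a final nn.Sigmoid(): it outputs raw logits.
discriminator = nn.Sequential(
    nn.Linear(784, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),  # no nn.Sigmoid() here
)

# Pair the raw logits with BCEWithLogitsLoss instead of Sigmoid + BCELoss.
criterion = nn.BCEWithLogitsLoss()
```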

KrnTneja avatar May 20 '19 08:05 KrnTneja

Hey, have you found any solution to this? I'm having the same issue, and because of it I'm not getting any usable generated images.

arpita739 avatar May 13 '20 15:05 arpita739

Discriminator loss 0 means the discriminator is easily spotting the generator's images. This can happen in some cases, for example when the generator leaves checkerboard artifacts.

moulicm111 avatar Jun 07 '20 10:06 moulicm111

This may also occur when the total generator loss is a sum of two losses and the generator concentrates on minimizing the other loss because its weighting factor is larger.
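
For example (a hypothetical sketch; the loss names and weights are illustrative, not from any particular codebase):

```python
import torch

# Suppose the generator objective is a weighted sum of an adversarial term
# and a reconstruction term. If recon_weight dwarfs adv_weight, G mostly
# optimizes reconstruction and stops fighting D, so D's loss can fall to ~0.
adversarial_loss = torch.tensor(0.7)
reconstruction_loss = torch.tensor(0.05)
adv_weight, recon_weight = 1.0, 100.0

g_loss = adv_weight * adversarial_loss + recon_weight * reconstruction_loss
```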

moulicm111 avatar Jun 07 '20 11:06 moulicm111

I face the same problem when training a CycleGAN, with torch.sigmoid(D(fake_img)) and BCELoss() as the GAN loss, and eventually my G fell into a failure mode... Now I'm trying BCEWithLogitsLoss() to see what happens. Hope it works, and thank you @KrnTneja!

DISAPPEARED13 avatar Oct 02 '21 07:10 DISAPPEARED13

@KrnTneja: Thanks for the trick. Could you provide some code for it? I'm also running into the problem of D's loss going to zero.

There isn't really any code to show. Just ensure that the last layer of your discriminator is not a sigmoid layer, i.e. its output shouldn't be constrained to [0, 1]. I was using PyTorch, where I had to use torch.nn.BCEWithLogitsLoss instead of torch.nn.BCELoss.

What's the difference between combining BCEWithLogitsLoss with logit outputs and combining BCELoss with sigmoid outputs?

6xw avatar Mar 14 '22 13:03 6xw

@6xw When you use BCEWithLogitsLoss, the loss is computed with the log-sum-exp trick, which prevents overflow and thus increases numerical stability.
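
Concretely, for a logit $x$ and label $y \in \{0, 1\}$, implementations of BCE on logits typically use the numerically stable form

$$\ell(x, y) = \max(x, 0) - x\,y + \log\bigl(1 + e^{-|x|}\bigr),$$

which never exponentiates a large positive number, so it cannot overflow or silently saturate the way sigmoid followed by log can.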

Eliacus avatar Apr 21 '22 09:04 Eliacus

@KrnTneja @mateuszkaleta

I have commented out the Sigmoid layer in the discriminator and used BCEWithLogitsLoss and the Adam optimizer with a learning rate of 0.0001, but the discriminator loss still reaches zero after 30 epochs. Is there any way to fix that?

Pravin770 avatar Aug 04 '22 15:08 Pravin770

@KrnTneja @mateuszkaleta

I have commented out the Sigmoid layer in the discriminator and used BCEWithLogitsLoss and the Adam optimizer with a learning rate of 0.0001, but the discriminator loss still reaches zero after 30 epochs. Is there any way to fix that?

Did you ever find a solution?

Raha304 avatar Oct 31 '22 12:10 Raha304