WGAN-tensorflow

Is the discriminator's output in right scale?

LavieC opened this issue 7 years ago · 7 comments

It seems that in the original paper the output of the discriminator (d_loss) is an estimate of the EM distance, so shouldn't it be positive? The curve of d_loss shows that it tends to converge, but the negative values seem weird.

[image: d_loss training curve]

LavieC avatar Apr 16 '17 02:04 LavieC

Please refer to README:

In this implementation, the critic loss is tf.reduce_mean(fake_logit - true_logit), and generator loss is tf.reduce_mean(-fake_logit).

and note that the critic objective comes from the sup (dual) form of the EM distance.

So the d_loss here is not really the EM distance; it is the negative of the EM-distance estimate. We call it a loss because a loss is something we minimize, whereas the critic's job is to maximize the dual form of the EM distance.
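For concreteness, here is a minimal sketch of those loss definitions and of how d_loss relates to the EM-distance estimate, assuming the TF 1.x API; the placeholder tensors are hypothetical stand-ins for the critic's outputs on a real batch and a generated batch.

```python
import tensorflow as tf

# Hypothetical stand-ins for the critic's outputs; in the repo these come
# from running the critic network on a real batch and a generated batch.
true_logit = tf.placeholder(tf.float32, [None, 1])  # critic(real batch)
fake_logit = tf.placeholder(tf.float32, [None, 1])  # critic(fake batch)

# Critic loss as quoted from the README: minimizing it maximizes
# E[critic(real)] - E[critic(fake)], the sup (dual) form of the EM distance.
c_loss = tf.reduce_mean(fake_logit - true_logit)

# Generator loss: raise the critic's score on generated samples.
g_loss = tf.reduce_mean(-fake_logit)

# So the EM-distance estimate is the *negative* of the plotted d_loss.
em_estimate = -c_loss
```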

In case you still have doubts about the minus sign on d_loss, that is, about why the EM distance grows instead of being minimized (as described in my comment above), I recommend reading the paper Generalization and Equilibrium in Generative Adversarial Nets (GANs).

Zardinality avatar Apr 16 '17 03:04 Zardinality

Thanks for your reply.

I see the point of minimizing d_loss, but I am still confused about maximizing versus minimizing the EM distance. During training we minimize d_loss to obtain an estimate of the EM distance, but the EM distance increases(!) while d_loss decreases. Doesn't this conflict with our ultimate goal of minimizing the EM distance?

LavieC avatar Apr 18 '17 03:04 LavieC

You mentioned

During training we minimize d_loss to obtain an estimate of the EM distance.

That is true: by minimizing d_loss we obtain an estimate of the EM distance for the current generator. But the current generator has already been improved against the previous critic, so the EM distance measured with the previous critic is lower than it should be, isn't it? (Note that this is not the real EM distance; it is an outdated estimate.) Updating the critic against the current generator therefore drives the estimate back up to a higher EM distance. In theory the EM distance should eventually go down under the combined effect of updating the critic and the generator. In practice, however, because we only use finite samples to estimate the distance, it does not always behave that way. You might want to refer to the paper above for further information.
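To make the alternating dynamics concrete, below is a rough, self-contained sketch of a 1-D toy WGAN training loop, assuming TF 1.x; the network sizes, learning rate, n_critic = 5, and clip value 0.01 follow the WGAN paper's defaults rather than this repo's exact settings. During the inner critic loop the EM estimate (-c_loss) rises toward the true distance for the current generator; the single generator step then pushes it back down.

```python
import numpy as np
import tensorflow as tf

def critic(x, reuse=False):
    # Unbounded score (no sigmoid), as required by the WGAN critic.
    with tf.variable_scope("critic", reuse=reuse):
        h = tf.layers.dense(x, 32, activation=tf.nn.relu)
        return tf.layers.dense(h, 1)

def generator(z):
    with tf.variable_scope("generator"):
        h = tf.layers.dense(z, 32, activation=tf.nn.relu)
        return tf.layers.dense(h, 1)

real = tf.placeholder(tf.float32, [None, 1])
z = tf.placeholder(tf.float32, [None, 1])
fake = generator(z)

true_logit = critic(real)
fake_logit = critic(fake, reuse=True)

c_loss = tf.reduce_mean(fake_logit - true_logit)  # minimized by the critic
g_loss = tf.reduce_mean(-fake_logit)              # minimized by the generator
em_estimate = -c_loss                             # the quantity the paper plots

c_vars = tf.trainable_variables("critic")
g_vars = tf.trainable_variables("generator")
c_opt = tf.train.RMSPropOptimizer(5e-5).minimize(c_loss, var_list=c_vars)
g_opt = tf.train.RMSPropOptimizer(5e-5).minimize(g_loss, var_list=g_vars)
clip_c = [v.assign(tf.clip_by_value(v, -0.01, 0.01)) for v in c_vars]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2000):
        # Critic phase: the EM estimate rises toward the true distance
        # for the *current* generator.
        for _ in range(5):
            xr = np.random.normal(3.0, 1.0, (64, 1))  # toy "real" data
            zs = np.random.normal(0.0, 1.0, (64, 1))
            sess.run(c_opt, {real: xr, z: zs})
            sess.run(clip_c)  # keep the critic roughly 1-Lipschitz
        # Generator phase: one update lowers the EM estimate again.
        zs = np.random.normal(0.0, 1.0, (64, 1))
        sess.run(g_opt, {z: zs})
        if step % 200 == 0:
            xr = np.random.normal(3.0, 1.0, (64, 1))
            zs = np.random.normal(0.0, 1.0, (64, 1))
            print(step, "EM estimate:", sess.run(em_estimate, {real: xr, z: zs}))
```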

Zardinality avatar Apr 18 '17 05:04 Zardinality

In theory the EM distance should eventually go down under the combined effect of updating the critic and the generator.

Like in the original GAN? This sort of curve is also what I expect:

[image: expected training curve]

But I think the most confusing thing is that in the Wasserstein GAN paper they show a smooth training curve that consistently decreases until convergence.

[image: training curve from the Wasserstein GAN paper]

LavieC avatar Apr 18 '17 06:04 LavieC

Yes. Though I believe they applied some smoothing to that curve, a decreasing curve is guaranteed by the theory. Notice that they ran for about half a billion steps; you may want to run more epochs to see whether the estimated EM distance decreases. (Personally I don't care whether it decreases, as long as it does not converge to 0 while sample quality is still bad.)
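If you want to compare your own curve with the paper's, one simple option is to smooth the logged estimates before plotting. The paper's exact smoothing method is not stated in this thread, so the moving average below is just an assumption; em_history stands for whatever list of -c_loss values you record during training.

```python
import numpy as np

def moving_average(values, window=100):
    """Smooth a 1-D sequence with a simple box filter for plotting."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# Usage: em_history is a list of recorded EM-distance estimates (-c_loss).
# smoothed = moving_average(np.array(em_history))
```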

Zardinality avatar Apr 18 '17 07:04 Zardinality

This helps a lot. Thanks!

LavieC avatar Apr 18 '17 11:04 LavieC

Hi, I have a question about c_loss. When I run my code, the absolute value of the (negative) c_loss keeps increasing. But when I adjust the model's learning rate so that c_loss is positive, it decreases. Is this normal? I ask because in other people's code c_loss is negative, and its absolute value decreases as training goes on.

[image: c_loss training curve]

Here is my code:

[image: code screenshot]

Losstie avatar Feb 22 '19 02:02 Losstie