Rithesh Kumar
So the issue is that the code is written in a way that doesn't support training using CPU (my bad). You could convert all the `.cuda()` statements in the code...
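A minimal sketch of the device-agnostic pattern that usually replaces hard-coded `.cuda()` calls in PyTorch. The `model` and `x` below are hypothetical stand-ins, not names from the repo:

```python
import torch
import torch.nn as nn

# Pick the GPU when one is visible, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in model; instead of `model.cuda()`, move it with `.to(device)`.
model = nn.Linear(4, 2).to(device)

# Create (or move) tensors on the same device instead of calling `.cuda()` on them.
x = torch.randn(8, 4, device=device)
y = model(x)

print(y.shape, y.device)
```

With this pattern the same script runs unchanged on a machine with or without a GPU.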
Hey, this issue seems peculiar to me. Are you saying `is_training=True` returns the correct answer but `is_training=False` returns the wrong one?
Right, sorry. I just noticed in tflib.contrib that batch_norm doesn't update the moving means / averages by default. I thought it did. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm
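To illustrate why stale moving statistics make `is_training=False` misbehave, here is a pure-Python sketch of batch norm on a single feature. `TinyBatchNorm` and its `update_moving` flag are hypothetical, written only for this illustration; they are not the TensorFlow API. As far as I recall, in `tf.contrib.layers.batch_norm` the update ops are placed in `tf.GraphKeys.UPDATE_OPS` and have to be run alongside the train op, so treat that detail as something to confirm on the linked page.

```python
class TinyBatchNorm:
    """Illustrative 1-D batch norm (hypothetical, not the TensorFlow implementation)."""

    def __init__(self, decay=0.9, eps=1e-5):
        self.decay = decay
        self.eps = eps
        self.moving_mean = 0.0  # initial values before any update
        self.moving_var = 1.0

    def __call__(self, batch, is_training, update_moving=True):
        if is_training:
            # Training mode normalizes with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            if update_moving:
                d = self.decay
                self.moving_mean = d * self.moving_mean + (1 - d) * mean
                self.moving_var = d * self.moving_var + (1 - d) * var
        else:
            # Inference mode relies entirely on the moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]


batch = [10.0, 12.0, 14.0]

# Case 1: moving statistics are never updated during training.
stale = TinyBatchNorm()
for _ in range(200):
    stale(batch, is_training=True, update_moving=False)
stale_out = stale(batch, is_training=False)

# Case 2: moving statistics are updated on every training step.
fresh = TinyBatchNorm()
for _ in range(200):
    fresh(batch, is_training=True, update_moving=True)
fresh_out = fresh(batch, is_training=False)
train_out = fresh(batch, is_training=True)

print("stale inference:", stale_out)
print("fresh inference:", fresh_out)
print("training mode:  ", train_out)
```

In the stale case the inference output is essentially the unnormalized input, which matches the symptom of `is_training=True` looking right while `is_training=False` looks wrong.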
Of the different divergences you can use with the f-GAN formulation, JSD worked better because it's bounded. KL is unbounded and should not be used to perform any MI...
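A small numerical sketch of the boundedness point, using two hypothetical two-outcome distributions that drift apart as `eps` shrinks. With natural logs, JSD never exceeds log 2, while KL keeps growing:

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# As eps shrinks, p and q put their mass on opposite outcomes.
for eps in (1e-1, 1e-3, 1e-6, 1e-9):
    p = [1 - eps, eps]
    q = [eps, 1 - eps]
    print(f"eps={eps:g}  KL={kl(p, q):.3f}  JSD={jsd(p, q):.4f}  log2={math.log(2):.4f}")
```

The KL column grows without limit as the supports separate, whereas the JSD column saturates just below log 2.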
If you look at the appendix in the MINE paper, it says that maximizing MI between input and output of the GAN generator works only if they perform adaptive gradient...
Well, the JSD or even MI doesn't come out of nowhere. We want to minimize KL(P_G || P_E) since we're training a generator to approximate the energy function for efficient...
I'd rather you not think in terms of GANs (wgan-gp, wgan) in the first place. The objective of this paper is not to find a new GAN. It's to...
Hello Yuntian, As evident from the way the html file looks, it is exported from an ipython notebook. I use the default MathJax renderer in ipython notebook to render the...
1. Could be interesting to try. We went with gradient penalty because it makes sense theoretically: we want true data to be energy minima, by optimizing for the norm of...
So it is indeed interesting to explore other options instead of gradient penalty to fix the temperature explosion. We couldn't find other ways to prevent it from exploding. You could...