Rithesh Kumar comments

Results: 16 comments of Rithesh Kumar

In Windows, even when I pass nothing ("") to set_env.sh, I am getting GPU out of memory!

So the issue is that the code is written in a way that doesn't support training using CPU (my bad). You could convert all the `.cuda()` statements in the code...

is_training=False

Hey, this issue seems peculiar to me. Are you saying is_training=True returns the correct answer but is_training=False returns the wrong one?

Right, sorry. I just noticed that batch_norm in tf.contrib doesn't update the moving mean and variance by default. I thought it did. Check this: https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm
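
To make the symptom concrete, here is a minimal pure-Python sketch (not the TensorFlow implementation) of why is_training=False can give wrong outputs when the moving statistics are never updated. The `BatchNorm1D` class, its `update_moving` flag, and the toy batch are all hypothetical names made up for illustration. In the real library, as far as the linked docs describe, the fix is to make sure the update ops collected under `tf.GraphKeys.UPDATE_OPS` actually run with the train op.

```python
import math


class BatchNorm1D:
    """Toy batch norm over a list of scalars (hypothetical, for illustration only)."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        # Initial moving statistics, as in a freshly created layer.
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def __call__(self, batch, is_training, update_moving=True):
        if is_training:
            # Training mode normalizes with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            if update_moving:
                m = self.momentum
                self.moving_mean = m * self.moving_mean + (1 - m) * mean
                self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Inference mode relies entirely on the moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


batch = [10.0, 12.0, 14.0, 16.0]

# Case 1: moving statistics are never updated during "training".
stale = BatchNorm1D()
for _ in range(200):
    stale(batch, is_training=True, update_moving=False)
stale_eval = stale(batch, is_training=False)

# Case 2: moving statistics are updated on every training step.
fresh = BatchNorm1D()
for _ in range(200):
    fresh(batch, is_training=True, update_moving=True)
fresh_eval = fresh(batch, is_training=False)

# Reference: what training mode produces for the same batch.
train_out = fresh(batch, is_training=True, update_moving=False)

print("train mode:      ", train_out)
print("eval, updated:   ", fresh_eval)
print("eval, not updated:", stale_eval)
```

In the second case the inference output matches the training output closely, while in the first case inference still normalizes with the initial mean 0 and variance 1, which is the kind of mismatch described in this thread.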

why not use KL divergence to estimate mutual information

Of the different divergences that you can use with the f-GAN formulation, JSD worked better because it's bounded. KL is unbounded and should not be used to perform any MI...
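
A small numeric sketch of the boundedness point, on discrete distributions only (it says nothing about the neural estimators themselves). The helper names `kl` and `jsd` and the toy distributions are assumptions made up for this example. With natural logs, JSD never exceeds log 2, while KL grows without limit as the two distributions stop overlapping.

```python
import math


def kl(p, q):
    """KL(p || q) in nats; infinite when q is 0 somewhere p is positive."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def jsd(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


p = [1.0, 0.0]
results = []
for eps in (1e-1, 1e-3, 1e-6):
    # q puts only eps of its mass where p lives, so overlap shrinks with eps.
    q = [eps, 1.0 - eps]
    results.append((eps, kl(p, q), jsd(p, q)))
    print(f"eps={eps:g}  KL={kl(p, q):.4f}  JSD={jsd(p, q):.4f}")

# Fully disjoint supports: KL is infinite, JSD saturates at log 2.
disjoint_kl = kl(p, [0.0, 1.0])
disjoint_jsd = jsd(p, [0.0, 1.0])
print("disjoint: KL =", disjoint_kl, " JSD =", disjoint_jsd)
```

As eps shrinks, the KL column keeps growing like log(1/eps) while the JSD column stays below log 2.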

why not use KL divergence to estimate mutual information

If you look at the appendix in the MINE paper, it says that maximizing MI between input and output of the GAN generator works only if they perform adaptive gradient...

why not use KL divergence to estimate mutual information

Well, the JSD or even MI doesn't come out of nowhere. We want to minimize KL(P_G || P_E) since we're training a generator to approximate the energy function for efficient...

how can we explain GAN works without I(X,Z) term?

I would rather you not think about GANs (WGAN-GP, WGAN) in the first place. The objective of this paper is not to find a new GAN. It's to...

The rendering effect

Hello Yuntian, as is evident from the way the HTML file looks, it is exported from an IPython notebook. I use the default MathJax renderer in the IPython notebook to render the...

how if we use hinge loss as EnergyModel loss?

1. Could be interesting to try. We went with gradient penalty because it makes sense theoretically: we want true data to be energy minima, by optimizing for the norm of...

how if we use hinge loss as EnergyModel loss?

So it is indeed interesting to explore other options instead of gradient penalty to fix the temperature explosion. We couldn't find other ways to prevent it from exploding. You could...