
Latent Space

Open Bob-RUC opened this issue 5 years ago • 5 comments

Hi, I noticed the paper said the trained latent space is a mixed Gaussian distribution with trainable variance and expectation:

In particular, we propose a reparameterization of the latent space as a Mixture- of-Gaussians model.

However, it seems that in the script the latent space is sampled from a plain uniform distribution instead: `display_z = np.random.uniform(-1.0, 1.0, [batchsize, z_dim]).astype(np.float32)`. I don't quite understand this inconsistency.
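For context, my understanding of the paper's reparameterization is something like the following NumPy sketch (all sizes and initial values here are made up by me, not taken from the repository):

```python
import numpy as np

rng = np.random.default_rng(0)

batch_size, z_dim, n_components = 4, 2, 8

# Trainable parameters of the mixture. They are fixed here for illustration;
# in the paper they are learned jointly with the generator weights.
mu = rng.uniform(-1.0, 1.0, size=(n_components, z_dim)).astype(np.float32)
sigma = np.full((n_components, z_dim), 0.2, dtype=np.float32)

def sample_mog_z(batch_size):
    """Reparameterized draw from the mixture: z = mu_i + sigma_i * eps."""
    idx = rng.integers(0, n_components, size=batch_size)  # one component per sample
    eps = rng.standard_normal((batch_size, z_dim)).astype(np.float32)
    return mu[idx] + sigma[idx] * eps

z = sample_mog_z(batch_size)
print(z.shape)
```

The reparameterization trick here is what makes mu and sigma trainable: gradients flow through the deterministic affine map while the randomness stays in eps.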

Bob-RUC avatar Jul 18 '19 02:07 Bob-RUC

Hi Bob,

thanks for pointing out the issue. The distribution used for training on MNIST is actually defined in the following line, and it is indeed the standard normal distribution: https://github.com/val-iisc/deligan/blob/68451c8923650b9239a87efb3b88f04b6969e54b/src/mnist/dg_mnist.py#L190 The line you pointed out initializes the variable for evaluation, and that is most probably a bug in our code. I think it was left over from some experiments we were doing after submission; the results in the paper correspond to the case where display_z was sampled from the standard normal distribution as well. We will correct this bug in the repository soon.
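Concretely, the one-line fix for the evaluation variable would look something like this (the sizes below are placeholders; use the values from dg_mnist.py):

```python
import numpy as np

batchsize, z_dim = 64, 100  # placeholder sizes, not the repository's values

# Buggy version: uniform sampling, inconsistent with training.
# display_z = np.random.uniform(-1.0, 1.0, [batchsize, z_dim]).astype(np.float32)

# Corrected version: standard normal, matching the training distribution.
display_z = np.random.normal(0.0, 1.0, [batchsize, z_dim]).astype(np.float32)
print(display_z.shape)
```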

However, the code for the other datasets (CIFAR-10 and sketches) doesn't have that bug. Feel free to use it as is.

Thanks for your interest in our paper.

swami1995 avatar Jul 21 '19 03:07 swami1995

Thank you very much for your reply. I've fixed it as you suggested, and it now works as the paper presented. However, I have one more question about the optimization code. I noticed that two parameters, t1 and thres, are used to control the range of the generator loss: t1 controls thres, and thres directly controls the generator loss. It seems like a particularly delicate control method for a GAN, but I can't figure out how it was developed to fit the model. Could you please give me some intuition on this?

Bob-RUC avatar Jul 22 '19 02:07 Bob-RUC

Hi Bob,

I essentially used those variables to provide a curriculum during training. thres was used to decide whether to update the generator or the discriminator, based on the generator loss. Simultaneously, the value of thres was increased or decreased after each generator/discriminator iteration to ensure that neither of them gets overtrained. t1 was just a heuristically chosen constant that provided a lower bound for thres.
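In pseudocode, the scheme is roughly the following (the exact step sizes and update directions in the repository may differ; this toy loop with simulated losses is only meant to illustrate the idea):

```python
import random

t1 = 0.5      # heuristic lower bound for thres
thres = 1.0   # initial threshold on the generator loss
step = 0.01   # how much thres adapts per iteration

def train_step(g_loss):
    """Pick which network to update this iteration and adapt thres."""
    global thres
    if g_loss > thres:
        updated = "generator"
        thres = max(t1, thres - step)  # lower the bar, but never below t1
    else:
        updated = "discriminator"
        thres = thres + step           # raise the bar for the next G update
    return updated

random.seed(0)
decisions = [train_step(random.uniform(0.0, 2.0)) for _ in range(5)]
print(decisions, round(thres, 3))
```

The net effect is a simple feedback loop: whichever network has been winning recently gets trained less, which is what keeps either side from overtraining.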

Hope that helps with some of the intuition. However, I would not recommend using these heuristics. You'd be better off using more modern GAN frameworks to stabilize training rather than relying on them.

swami1995 avatar Jul 24 '19 23:07 swami1995

I think I have generally grasped your intuition. Thank you very much for helping me figure out what's happening here!

Bob-RUC avatar Jul 29 '19 02:07 Bob-RUC

I am new to TensorFlow. While running the toy dataset code, I got this error: "ValueError: Variable g_z already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?" How do I fix it?
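From the error message, I guess the variable is being created twice (e.g. the generator function is called more than once), and the fix involves something like the following TF 1.x pattern (the scope name and shape here are my guesses, not from the repository):

```python
import tensorflow as tf  # TensorFlow 1.x API

# Allowing reuse in the enclosing scope lets a second call to the builder
# fetch the existing "g_z" variable instead of trying to recreate it.
with tf.variable_scope("generator", reuse=tf.AUTO_REUSE):
    g_z = tf.get_variable("g_z", shape=[100],
                          initializer=tf.zeros_initializer())
```

Is that the right direction, or is there something else going on in the toy code?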

TanmDL avatar Sep 24 '19 17:09 TanmDL