
Latent Space

Open Bob-RUC opened this issue 5 years ago • 5 comments

Hi, I noticed the paper said the trained latent space is a mixed Gaussian distribution with trainable variance and expectation:

In particular, we propose a reparameterization of the latent space as a Mixture- of-Gaussians model.

However, it seems that in the script the latent space is sampled from a plain uniform distribution instead: `display_z = np.random.uniform(-1.0, 1.0, [batchsize, z_dim]).astype(np.float32)`. I don't quite understand this inconsistency.
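For context, my understanding of the paper's reparameterization is something like the following NumPy sketch (all sizes and initial values here are made up by me, not taken from the repository):

```python
import numpy as np

rng = np.random.default_rng(0)

batch_size, z_dim, n_components = 4, 2, 8

# Trainable parameters of the mixture. They are fixed here for illustration;
# in the paper they are learned jointly with the generator weights.
mu = rng.uniform(-1.0, 1.0, size=(n_components, z_dim)).astype(np.float32)
sigma = np.full((n_components, z_dim), 0.2, dtype=np.float32)

def sample_mog_z(batch_size):
    """Reparameterized draw from the mixture: z = mu_i + sigma_i * eps."""
    idx = rng.integers(0, n_components, size=batch_size)  # one component per sample
    eps = rng.standard_normal((batch_size, z_dim)).astype(np.float32)
    return mu[idx] + sigma[idx] * eps

z = sample_mog_z(batch_size)
print(z.shape)
```

The reparameterization trick here is what makes mu and sigma trainable: gradients flow through the deterministic affine map while the randomness stays in eps.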

Bob-RUC avatar Jul 18 '19 02:07 Bob-RUC

Hi Bob,

thanks for pointing out the issue. The distribution used for training on MNIST is actually defined in the following line, and it is indeed the standard normal distribution: https://github.com/val-iisc/deligan/blob/68451c8923650b9239a87efb3b88f04b6969e54b/src/mnist/dg_mnist.py#L190 The line you pointed out initializes the variable for evaluation, and that is most probably a bug in our code. I think it was left over from some experiments we were doing after submission; the results in the paper correspond to the case where display_z was sampled from the standard normal distribution as well. We will correct this bug in the repository soon.
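Concretely, the one-line fix for the evaluation variable would look something like this (the sizes below are placeholders; use the values from dg_mnist.py):

```python
import numpy as np

batchsize, z_dim = 64, 100  # placeholder sizes, not the repository's values

# Buggy version: uniform sampling, inconsistent with training.
# display_z = np.random.uniform(-1.0, 1.0, [batchsize, z_dim]).astype(np.float32)

# Corrected version: standard normal, matching the training distribution.
display_z = np.random.normal(0.0, 1.0, [batchsize, z_dim]).astype(np.float32)
print(display_z.shape)
```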

However, the code for the other datasets (CIFAR-10 and sketches) doesn't have that bug. Feel free to use it as is.

Thanks for your interest in our paper.

swami1995 avatar Jul 21 '19 03:07 swami1995

Thank you very much for your reply. I've fixed it as you suggested, and it now works as the paper presented. However, I have one more question about the optimization code. I noticed that two parameters, t1 and thres, are used to control the range of the generator loss: t1 controls thres, and thres directly controls the generator loss. It seems like a particularly delicate control method for a GAN, but I can't figure out how it was developed to fit the model. Could you please give me some intuition on this?

Bob-RUC avatar Jul 22 '19 02:07 Bob-RUC

Hi Bob,

I essentially used those variables to provide a curriculum during training. thres was used to decide whether to update the generator or the discriminator, based on the generator loss. Simultaneously, the value of thres was increased or decreased after each generator/discriminator iteration to ensure that neither of them gets overtrained. t1 was just a heuristically chosen constant that provided a lower bound for thres.
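In pseudocode, the scheme is roughly the following (the exact step sizes and update directions in the repository may differ; this toy loop with simulated losses is only meant to illustrate the idea):

```python
import random

t1 = 0.5      # heuristic lower bound for thres
thres = 1.0   # initial threshold on the generator loss
step = 0.01   # how much thres adapts per iteration

def train_step(g_loss):
    """Pick which network to update this iteration and adapt thres."""
    global thres
    if g_loss > thres:
        updated = "generator"
        thres = max(t1, thres - step)  # lower the bar, but never below t1
    else:
        updated = "discriminator"
        thres = thres + step           # raise the bar for the next G update
    return updated

random.seed(0)
decisions = [train_step(random.uniform(0.0, 2.0)) for _ in range(5)]
print(decisions, round(thres, 3))
```

The net effect is a simple feedback loop: whichever network has been winning recently gets trained less, which is what keeps either side from overtraining.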

Hope that helps with some of the intuition. However, I would not recommend using these heuristics. You'd be better off using more modern GAN frameworks to stabilize training rather than relying on them.

swami1995 avatar Jul 24 '19 23:07 swami1995

I think I have generally grasped your intuition. Thank you very much for helping me figure out what's happening here!

Bob-RUC avatar Jul 29 '19 02:07 Bob-RUC

I am new to TensorFlow. While running the toy dataset code, I got this error: "ValueError: Variable g_z already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?" How do I fix it?
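From the error message, I guess the variable is being created twice (e.g. the generator function is called more than once), and the fix involves something like the following TF 1.x pattern (the scope name and shape here are my guesses, not from the repository):

```python
import tensorflow as tf  # TensorFlow 1.x API

# Allowing reuse in the enclosing scope lets a second call to the builder
# fetch the existing "g_z" variable instead of trying to recreate it.
with tf.variable_scope("generator", reuse=tf.AUTO_REUSE):
    g_z = tf.get_variable("g_z", shape=[100],
                          initializer=tf.zeros_initializer())
```

Is that the right direction, or is there something else going on in the toy code?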

TanmDL avatar Sep 24 '19 17:09 TanmDL