Gumbel_Softmax_VAE

latent dim

Open yzhou359 opened this issue 6 years ago • 2 comments

Hi, what does latent_dim mean in your code? Can it be set to other values? I understand that categorical_dim means 10 categories for the 10 digits, but I'm confused about latent_dim. Thanks!

yzhou359 avatar Feb 14 '19 20:02 yzhou359

Same question here; it would be great if someone could shed light on latent_dim, which is N in the author's notebook https://github.com/ericjang/gumbel-softmax/blob/master/Categorical%20VAE.ipynb.

Why do we need latent_dim (the number of categorical distributions in the author's notebook), so that the fully-connected layer outputs categorical_dim * latent_dim values instead of just categorical_dim?

yjlolo avatar Dec 29 '19 09:12 yjlolo

I think latent_dim is the number of categorical variables in the model, while categorical_dim is the number of categories in each latent categorical variable. This is why the "true" dimensionality of the encoder output and the decoder input is 300 (30 variables x 10 categories per variable) in this model.

The misinterpretation stems from the assumption that the 10 categories of the categorical latent space represent the 10 digits, but this is not necessarily the case. The data varies in many ways besides digit type (azimuth, width, thickness), which is why the model needs more than just 10 categories in the latent space.
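To make the shapes concrete, here is a minimal pure-Python sketch of the layout described above. It is not the repository's code: the function names, the all-zero stand-in for the encoder output, and the temperature value are assumptions made for illustration. It splits a flat vector of 300 logits into 30 groups of 10 and draws one relaxed (Gumbel-softmax) sample per group.

```python
import math
import random

LATENT_DIM = 30       # number of categorical latent variables (N in the notebook)
CATEGORICAL_DIM = 10  # number of categories in each variable


def gumbel_softmax_row(logits, temperature, rng):
    """Draw a relaxed one-hot sample for a single categorical variable."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)    # avoid log(0)
        g = -math.log(-math.log(u))     # Gumbel(0, 1) noise
        noisy.append((logit + g) / temperature)
    m = max(noisy)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_latent(flat_logits, temperature=1.0, seed=0):
    """Split flat logits into LATENT_DIM groups and sample each group separately."""
    assert len(flat_logits) == LATENT_DIM * CATEGORICAL_DIM
    rng = random.Random(seed)
    rows = [
        flat_logits[i * CATEGORICAL_DIM:(i + 1) * CATEGORICAL_DIM]
        for i in range(LATENT_DIM)
    ]
    return [gumbel_softmax_row(row, temperature, rng) for row in rows]


# Hypothetical stand-in for the encoder's fully-connected output (all zeros).
flat_logits = [0.0] * (LATENT_DIM * CATEGORICAL_DIM)

z = sample_latent(flat_logits)
decoder_input = [p for row in z for p in row]  # flattened back to 300 values

print(len(z), len(z[0]), len(decoder_input))  # 30 10 300
```

Each of the 30 rows is its own probability vector over 10 categories, so the softmax is applied per variable and not across all 300 values. With hard one-hot samples, this layout could represent 10^30 distinct codes, compared with only 10 for a single 10-way variable.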

gokceneraslan avatar Nov 21 '20 05:11 gokceneraslan