TGAN
Question about the c_i,j normalisation
Hello, I have a question regarding the c_i,j normalization and clipping. The clustering of numerical variables in the multimodal case seems really interesting, but why not just estimate the probability that variable X comes from each cluster, take the most probable cluster, and then generate the X value given the generated cluster? I don't get the intuition behind this part of the methodology. Thanks in advance. Best, Aurélia
IIUC, you are suggesting that for a continuous column we first generate the cluster id using the GAN and then sample a value from the distribution of that cluster. But the distribution within each cluster is still unknown, so it is still necessary to use the GAN to model the distribution within the cluster.
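To make the point concrete, here is a minimal sketch of how a value could be encoded as cluster probabilities plus a normalized, clipped scalar. It is not taken from the TGAN code base: the mixture parameters (`WEIGHTS`, `MEANS`, `STDS`) are hand-picked, the helper names `encode` and `decode` are hypothetical, and the scaling by twice the standard deviation and the clipping range of [-0.99, 0.99] are assumptions about the scheme being discussed.

```python
import math

# Hypothetical, hand-picked mixture parameters for one continuous column.
# In practice these would come from fitting a Gaussian mixture to the data.
WEIGHTS = [0.5, 0.5]
MEANS = [0.0, 10.0]
STDS = [1.0, 2.0]


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def encode(x):
    """Return (cluster probabilities, most likely cluster index, clipped scalar)."""
    densities = [w * gaussian_pdf(x, m, s) for w, m, s in zip(WEIGHTS, MEANS, STDS)]
    total = sum(densities)
    probs = [d / total for d in densities]
    k = max(range(len(probs)), key=probs.__getitem__)
    # Assumed normalization: offset from the cluster mean, scaled by 2 * std.
    v = (x - MEANS[k]) / (2.0 * STDS[k])
    # Assumed clipping range, so the scalar fits a tanh-style output.
    v = max(-0.99, min(0.99, v))
    return probs, k, v


def decode(k, v):
    """Map a cluster index and normalized scalar back to the original scale."""
    return MEANS[k] + 2.0 * STDS[k] * v


probs, k, v = encode(11.0)
x_back = decode(k, v)
```

The cluster id alone only says which mode a value belongs to. The scalar `v` records where the value sits inside that mode, and that is the part whose distribution is not known in advance and has to be learned.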