
Question about the c_i,j normalisation

Open • druzkaya opened this issue 5 years ago • 1 comment

Hello, I have a question regarding the c_i,j normalization + clipping. The clustering of numerical variables in the multimodal case seems really interesting, but why not just estimate the probability that variable X comes from each cluster, take the most probable cluster, and then generate the X value given the generated cluster? I don't get the intuition behind this part of the methodology. Thanks in advance. Best, Aurélia

druzkaya, Aug 29 '19 09:08

IIUC, you are suggesting that for a continuous column, we first generate the cluster id using the GAN, and then sample a value from the distribution of that cluster. But the distribution within each cluster is still unknown, so it's necessary to use the GAN to model the distribution within the cluster as well.
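To make the point concrete, here is a minimal sketch of what a per-cluster normalization + clipping step could look like. It is not taken from the TGAN code base: the mixture parameters (`MEANS`, `STDS`, `WEIGHTS`), the helper names `encode` and `decode`, the scaling by two standard deviations, and the clip range of ±0.99 are all assumptions made for illustration. In practice the mixture parameters would come from fitting a Gaussian mixture model to the column.

```python
import math

# Hypothetical, hand-picked mixture parameters for one continuous column.
# In practice these would be estimated by fitting a Gaussian mixture model.
MEANS = [0.0, 10.0]
STDS = [1.0, 2.0]
WEIGHTS = [0.5, 0.5]


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def encode(x):
    """Map a raw value to (cluster probabilities, cluster id, within-cluster value)."""
    dens = [w * gaussian_pdf(x, m, s) for w, m, s in zip(WEIGHTS, MEANS, STDS)]
    total = sum(dens)
    if total == 0.0:
        # Densities underflowed for an extreme value: fall back to the
        # component whose mean is closest in units of its own std.
        k = min(range(len(MEANS)), key=lambda i: abs(x - MEANS[i]) / STDS[i])
        u = [1.0 if i == k else 0.0 for i in range(len(MEANS))]
    else:
        u = [d / total for d in dens]
        k = max(range(len(u)), key=u.__getitem__)
    # Normalize relative to the chosen cluster, then clip (assumed range).
    v = (x - MEANS[k]) / (2.0 * STDS[k])
    v = max(-0.99, min(0.99, v))
    return u, k, v


def decode(k, v):
    """Invert the normalization for a given cluster id and within-cluster value."""
    return MEANS[k] + 2.0 * STDS[k] * v
```

In this sketch, `encode(9.0)` and `encode(11.0)` both select cluster 1 but yield different values of `v`. The cluster id alone therefore does not determine the original value; the within-cluster value `v` carries the rest, and that is the part whose distribution is unknown and left for the GAN to model.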

leix28, Sep 17 '19 03:09