Losses quickly converge to zero
Have you run into the problem where the loss converges to zero within two epochs, even with very large swap noise (>0.5) or dropout? Meanwhile, the transformed features do not contain useful information. I am not sure whether this is caused by the dataset or not...
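For context, by swap noise I mean something like this (a rough numpy sketch, not the repo's actual implementation; `p` is the swap probability):

```python
import numpy as np

def swap_noise(X, p=0.5, rng=None):
    """Corrupt X by replacing each cell, with probability p, by the value
    from the same column of a randomly chosen donor row."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    mask = rng.random((n, d)) < p                 # cells to corrupt
    donor_rows = rng.integers(0, n, size=(n, d))  # donor row per cell
    cols = np.tile(np.arange(d), (n, 1))
    X_noisy = X.copy()
    X_noisy[mask] = X[donor_rows[mask], cols[mask]]
    return X_noisy
```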
Hi, can you describe the characteristics of the dataset a bit more? I found a bug in the code that handles categorical feature embedding. If your dataset is mostly categorical features and uses embeddings, that might be the issue.
It is weird. My dataset has only continuous features. Even when I use a single hidden unit, the loss can still go down to almost zero... It looks like there is some leakage either in the model or in my data.
Ok, thanks for the details. You could try holding out one column and using all the rest to predict it with a simple model; if you can predict the holdout almost perfectly, that column is (nearly) a function of the others, which would explain the near-zero reconstruction loss.
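Something like this (a quick scikit-learn sketch; `Ridge` and the column loop are just examples, any simple model works):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def leakage_check(X, col, cv=5):
    """Hold out column `col` and predict it from the remaining columns.
    A near-perfect R^2 from a simple linear model suggests the column
    is (almost) determined by the others, i.e. leakage/redundancy."""
    y = X[:, col]
    X_rest = np.delete(X, col, axis=1)
    scores = cross_val_score(Ridge(), X_rest, y, scoring="r2", cv=cv)
    return scores.mean()

# e.g. check every column:
# for j in range(X.shape[1]):
#     print(j, leakage_check(X, j))
```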