Tabular Dataset

Open mortezamg63 opened this issue 3 years ago • 0 comments

Hello, I used the Adult dataset (https://archive.ics.uci.edu/ml/datasets/adult) with the tabular DAE. I used 80% of the data for the train set and 20% for the test set. After training the DAE, I use 10% of the train set as labeled data (self-supervised learning). But self-supervised learning underperforms or performs the same as supervised learning on just the labeled data. The self-supervised accuracy is 83.x%, which is the same as supervised learning on the labeled data. On some datasets, training the DAE on the whole train set and then using 10% as labeled data to train a supervised FNN outperforms supervised learning on that 10% alone, but on other datasets it underperforms. Is there something about the dataset that I need to consider? For example, do only certain types of tabular datasets benefit from unlabeled data and improve in accuracy compared to supervised learning?

Or is the problem related to hyperparameter tuning? I am quite confused about this. I would really appreciate any guidance or help with this problem.
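
For reference, the split described above can be sketched as follows. This is only an illustration of the protocol (80% train, 20% test, 10% of the train set labeled), not the code actually used in the experiment; the function name `make_splits`, its parameters, and the seed are hypothetical.

```python
import random

def make_splits(n_rows, test_frac=0.2, labeled_frac=0.1, seed=0):
    """Return (train_idx, test_idx, labeled_idx) as lists of row indices.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    rng = random.Random(seed)
    idx = list(range(n_rows))
    rng.shuffle(idx)

    # Hold out the test rows first; neither model sees them during training.
    n_test = int(round(n_rows * test_frac))
    test_idx = idx[:n_test]
    train_idx = idx[n_test:]

    # train_idx is already shuffled, so its first rows are a random subset.
    n_labeled = int(round(len(train_idx) * labeled_frac))
    labeled_idx = train_idx[:n_labeled]
    return train_idx, test_idx, labeled_idx

train_idx, test_idx, labeled_idx = make_splits(1000)
print(len(train_idx), len(test_idx), len(labeled_idx))  # 800 200 80
```

In this setup the DAE is trained on all rows in `train_idx` without labels, the downstream classifier and the supervised baseline are both trained only on `labeled_idx`, and both are evaluated on `test_idx`, so the comparison uses the same labeled and test rows.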

Thanks

mortezamg63, Jun 04 '22 17:06