
Imbalanced Dataset

Open · millanp95 opened this issue 2 years ago · 1 comment

Hi,

Thank you for this implementation. It is my understanding that some contrastive frameworks build on entropy maximization, which makes them unsuitable for imbalanced datasets. From the paper, I could see that you also maximize the entropy in your loss function. Can the instance-level term mitigate the entropy maximization issue and make the method suitable for imbalanced datasets?

Thanks

millanp95, Nov 30 '21 20:11

Yes, the instance-level contrastive learning is not sensitive to imbalanced datasets. In fact, we have tested our method on some imbalanced datasets by simply removing the entropy maximization term on the cluster-level contrastive head, and it gives reasonable results instead of a trivial solution. Perhaps you could try putting a smaller weight on the entropy maximization term and strengthening the instance-level term, just like you said.
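
To make the weighting suggestion concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function names, the `instance_weight` and `entropy_weight` parameters, and the exact form of the penalty (log K minus the entropy of the average cluster assignment, so it is zero for perfectly balanced clusters) are assumptions chosen for illustration.

```python
import math


def negative_entropy_penalty(soft_assignments):
    """Return log(K) - H(p), where p is the mean cluster-assignment distribution.

    soft_assignments: list of rows, each a probability vector over K clusters.
    The penalty is 0 for perfectly balanced clusters and grows with imbalance.
    """
    n = len(soft_assignments)
    k = len(soft_assignments[0])
    # Average assignment probability per cluster across the batch.
    marginal = [sum(row[j] for row in soft_assignments) / n for j in range(k)]
    # Skip zero entries, using the convention 0 * log(0) = 0.
    entropy = -sum(p * math.log(p) for p in marginal if p > 0.0)
    return math.log(k) - entropy


def weighted_total_loss(instance_loss, cluster_loss, soft_assignments,
                        instance_weight=1.0, entropy_weight=1.0):
    """Combine the loss terms with hypothetical weights (assumed, not from the repo)."""
    penalty = negative_entropy_penalty(soft_assignments)
    return instance_weight * instance_loss + cluster_loss + entropy_weight * penalty


# Toy batch with a 3:1 cluster imbalance.
imbalanced = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# entropy_weight=0.0 drops the entropy term; a small value such as 0.1 only softens it.
loss_without_entropy = weighted_total_loss(2.0, 1.0, imbalanced, entropy_weight=0.0)
loss_down_weighted = weighted_total_loss(2.0, 1.0, imbalanced,
                                         instance_weight=2.0, entropy_weight=0.1)
```

Setting `entropy_weight=0.0` corresponds to removing the term entirely, while a small positive value keeps a mild push toward balanced clusters. Suitable values would have to be tuned per dataset.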

Yunfan-Li, Dec 01 '21 01:12