Transformers-Tutorials

Class imbalance for LiLT

Open Altimis opened this issue 2 years ago • 3 comments

Hi @NielsRogge. Thanks again for your work.

I'm facing class imbalance while training a LiLT model. The imbalance is at the level of the B-, I-, E-, S- sub-classes. I have around 9 classes, each with 4 sub-classes, which makes 36 classes in total. Some of these are highly represented in the data (for example, S-CLASS_A), while others are under-represented (for example, I-CLASS_B). How can I handle this issue? Thanks in advance.

Altimis, Jun 08 '23 16:06

I'd recommend using the weight argument of the cross-entropy loss (torch.nn.CrossEntropyLoss in PyTorch): https://naadispeaks.blog/2021/07/31/handling-imbalanced-classes-with-weighted-loss-in-pytorch/

NielsRogge, Jun 08 '23 20:06
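
For readers following along, here is a minimal sketch of what a weighted cross-entropy loss could look like for token classification in PyTorch. It is not taken from the thread or the linked blog post: the number of labels, the weight values, and the fake logits and labels are all illustrative assumptions. The only library calls used are torch.nn.CrossEntropyLoss with its weight and ignore_index arguments.

```python
import torch
import torch.nn as nn

# Hypothetical setup: 5 labels, with label 0 standing in for a dominant class.
num_labels = 5
# Illustrative weights only: smaller for the frequent class, larger for rare ones.
class_weights = torch.tensor([0.2, 1.0, 1.0, 2.0, 4.0])

# ignore_index=-100 skips padding / special tokens, following the usual
# Hugging Face token-classification labeling convention.
loss_fct = nn.CrossEntropyLoss(weight=class_weights, ignore_index=-100)

# Fake model outputs: logits of shape (batch, seq_len, num_labels)
# and labels of shape (batch, seq_len).
torch.manual_seed(0)
logits = torch.randn(2, 6, num_labels)
labels = torch.tensor([[0, 0, 1, 4, -100, -100],
                       [0, 3, 0, 0, 2, -100]])

# Flatten so that every token is treated as one classification example.
loss = loss_fct(logits.view(-1, num_labels), labels.view(-1))
print(loss.item())
```

When training with the Hugging Face Trainer, a loss like this would typically be plugged in by subclassing Trainer and overriding compute_loss; the exact signature of that method depends on the installed transformers version, so check it before copying any snippet.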

Thank you @NielsRogge for your response. Wouldn't the 'other' class cause an issue with this weighting, since it's dominant compared with the other classes?

Altimis, Jun 09 '23 00:06

You need to set the weights according to the frequencies of the classes (rarer classes get larger weights, so the dominant 'other' class gets a small one), as explained in the blog post above.

NielsRogge, Jun 09 '23 07:06
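
One possible way to derive such weights from the training labels is sketched below. It assumes inverse-frequency weighting with the same formula as scikit-learn's "balanced" heuristic, total / (num_labels * count), which may differ from what the linked blog post uses. The helper name inverse_frequency_weights, the toy label sequences, and the choice to give unseen labels a weight of 1.0 are all hypothetical.

```python
from collections import Counter

import torch


def inverse_frequency_weights(label_sequences, num_labels, ignore_index=-100):
    """Return one weight per label id, larger for rarer labels.

    Hypothetical helper: uses total / (num_labels * count) per label.
    Labels never seen in the data keep a weight of 1.0 (an assumption).
    """
    # Count every label occurrence, skipping ignored positions (padding etc.).
    counts = Counter(
        label
        for seq in label_sequences
        for label in seq
        if label != ignore_index
    )
    total = sum(counts.values())

    weights = torch.ones(num_labels)
    for label_id in range(num_labels):
        if counts[label_id] > 0:
            weights[label_id] = total / (num_labels * counts[label_id])
    return weights


# Toy data: label 0 plays the role of the dominant 'other' class.
train_labels = [
    [0, 0, 0, 0, 1, -100],
    [0, 0, 2, 0, 0, 0],
]
weights = inverse_frequency_weights(train_labels, num_labels=3)
print(weights)
```

In this toy data the dominant label 0 ends up with the smallest weight, which is how frequency-based weighting addresses the concern about the 'other' class. The resulting tensor can be passed as the weight argument of torch.nn.CrossEntropyLoss. With 36 labels and very rare sub-classes, raw inverse frequencies can become large, so clipping or smoothing them may be worth trying.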