Unbalanced classes

Open arthurdouillard opened this issue 3 years ago • 0 comments

From @umbertocappellazzo:

Well, the split into train, test and valid has been made by the authors who created the corpus and I don't know whether they crafted then different sets. Since I'm the first to use FSC in a CL scenario, I think it could be ok to proceed in this way, and I understand your rigorousness on this matter. So, you have the last word about this. I'll take advantage of this thread to ask one thing: does Continuum handle the case of unbalanced classes for rehearsal? I had a look and I suppose not, but I want to be sure. If the dataset contains unbalanced classes, it's not fair to keep the same # of samples for each class. If Continuum doesn't cover this case, I can come up with a solution for my project and then I can make a PR (if you think this is worth it).

I'd see two solutions:

  • either use a sampler given to the data loader to {over,under}-sample classes
  • or use a custom RehearsalMemory where you'd allow sampling a different amount of samples per class (not sure this very particular case is worth adding to Continuum though)
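Both options can be sketched without touching Continuum itself. The snippet below is only an illustration under stated assumptions: `inverse_frequency_weights` and `proportional_quotas` are hypothetical helper names, not part of Continuum, and allocating the memory budget in proportion to class frequency is just one possible policy. The per-sample weights from the first helper could be handed to a weighted sampler such as PyTorch's `torch.utils.data.WeightedRandomSampler`, while the quotas from the second could drive how many samples a custom memory keeps per class.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per sample, inversely proportional to its class count.

    Hypothetical helper: rare classes get larger weights, so a weighted
    sampler would draw them more often (over-sampling).
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def proportional_quotas(labels, memory_size):
    """Split a rehearsal budget across classes in proportion to class frequency.

    Hypothetical helper: uses largest-remainder rounding so the quotas sum
    exactly to the budget (capped at the number of available samples).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    budget = min(memory_size, total)

    # Ideal (fractional) share of the budget for each class.
    exact = {c: budget * n / total for c, n in counts.items()}
    quotas = {c: int(v) for c, v in exact.items()}

    # Hand the remaining slots to the classes with the largest fractional parts.
    leftover = budget - sum(quotas.values())
    by_remainder = sorted(counts, key=lambda c: exact[c] - quotas[c], reverse=True)
    for c in by_remainder[:leftover]:
        quotas[c] += 1
    return quotas


if __name__ == "__main__":
    labels = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
    print(inverse_frequency_weights(["a", "a", "b"]))
    print(proportional_quotas(labels, memory_size=10))
```

The sampler route leaves the memory untouched and only changes how batches are drawn; the quota route changes what is stored, so it would need to be wired into whatever memory class the project uses.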

arthurdouillard • Jun 16 '22 09:06