
From which number of training samples does it not make sense anymore to use SetFit?

Open lbelpaire opened this issue 2 years ago • 1 comments

I'm building a classifier that assigns news articles to one of 8 categories. I was wondering whether there is a rule of thumb that, above a certain number of training samples per class, it would make more sense to use a traditional transformer classifier such as roberta-large. Or will SetFit always be more accurate?

lbelpaire avatar Jul 25 '23 06:07 lbelpaire

I was wondering if there was a rule of thumb that over a certain number of training samples per class it would make more sense to use a traditional transformer classifier such as roberta-large?

That is a good question. I would bet that this depends on the dataset when it comes to accuracy. Another perspective is whether you want to construct a very large contrastive dataset with SetFit when you already have a lot of examples to begin with: if you use n data points, the exhaustive contrastive set contains on the order of n(n-1)/2 pairs.
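To illustrate the point above, here is a quick sketch of how the exhaustive pair count n(n-1)/2 grows with the number of labeled examples. (In practice SetFit samples a subset of pairs rather than enumerating all of them, but the quadratic growth is why large datasets become awkward for contrastive fine-tuning.)

```python
# Quadratic growth of the exhaustive contrastive pair set:
# n labeled examples yield n*(n-1)/2 unordered candidate pairs.
def num_pairs(n: int) -> int:
    return n * (n - 1) // 2

for n in (100, 1_000, 10_000):
    print(f"{n:>6} examples -> {num_pairs(n):,} candidate pairs")
```

So going from 100 to 10,000 examples multiplies the candidate pair count by roughly 10,000x, not 100x.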

Or will SetFit always be more accurate?

No, it's not always more accurate; see for example the benchmarks in https://huggingface.co/blog/setfit.

In that blog post, RoBERTa-Large fine-tuned on the full dataset outperforms SetFit trained with up to 60 examples per class.

kgourgou avatar Jul 25 '23 09:07 kgourgou