
How many samples for setfit?

Open hanshupe opened this issue 2 years ago • 3 comments

I understand that SetFit is a lightweight solution for few-shot learning. Two questions came up:

1) At what number of samples per class would you switch to standard supervised learning and fine-tuning? E.g. 100 samples?
2) Is there any disadvantage to generating too many pairs (num_iterations)? If I have 30 classes, wouldn't the default of 20 be too small to learn meaningful embeddings?

hanshupe avatar Oct 20 '22 06:10 hanshupe

In my experiment (50 classes, with 20 to 50 examples per class), SetFit accuracy was 0.857.

RoacherM avatar Oct 21 '22 05:10 RoacherM

Did you compare it to supervised learning with fine-tuning?

hanshupe avatar Oct 21 '22 06:10 hanshupe

As for the number of samples at which to switch from few-shot learning to standard supervised learning and fine-tuning, it is subjective. SetFit performance is slightly lower than standard supervised fine-tuning when there are a lot of examples.

I prefer SetFit when there is little or no training data (16 or 32 samples per class). If you have a way to do the labelling, go ahead and use standard supervised learning and fine-tuning. Ultimately it's a trade-off. But SetFit does give a very good start with few samples.

For num_iterations, I keep increasing it as long as I get a performance gain. (Treat it as a hyper-parameter.)
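
A rough sketch of that sweep, in case it helps. The names here are made up for illustration: `train_and_score` is a hypothetical callback in which you would train a SetFit model with the given num_iterations and return validation accuracy, and `min_gain` is an arbitrary threshold. The pair-count helper assumes one positive and one negative pair per training sample per iteration, which is my understanding of the default pair generation; check it against the version you have installed.

```python
from typing import Callable, Iterable, Optional, Tuple


def approx_pair_count(n_samples: int, num_iterations: int) -> int:
    # Assumption: each iteration draws one positive and one negative
    # pair for every training sample, so pairs grow linearly with both.
    return 2 * n_samples * num_iterations


def sweep_num_iterations(
    candidates: Iterable[int],
    train_and_score: Callable[[int], float],
    min_gain: float = 0.005,
) -> Tuple[Optional[int], float]:
    """Try candidates in increasing order and stop at the first one
    that does not improve the score by at least `min_gain`."""
    best_value: Optional[int] = None
    best_score = float("-inf")
    for value in sorted(candidates):
        # Hypothetical callback: train with num_iterations=value
        # and return accuracy on a held-out validation set.
        score = train_and_score(value)
        if score - best_score < min_gain:
            break  # no meaningful gain, keep the previous best
        best_value, best_score = value, score
    return best_value, best_score
```

With 30 classes and 16 samples per class, `approx_pair_count(30 * 16, 20)` comes to 19200 pairs under that assumption, so the practical cost of a larger num_iterations is mostly training time, which grows in proportion.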

mrtushartiwari avatar Feb 27 '23 10:02 mrtushartiwari