
Add more metrics (e.g. F1 score, precision) to trainer.evaluate()

snayan06 opened this issue on Oct 18 '22 · 5 comments

I was just checking the code and saw only accuracy as a metric. Are we planning to add more metrics?

snayan06 · Oct 18 '22

Hi @snayan06 thanks for your interest in setfit! Although accuracy is the default metric, you can specify a different one when you create the SetFitTrainer, e.g.:

from setfit import SetFitTrainer

trainer = SetFitTrainer(model=model, train_dataset=train_dataset, metric="f1")

Currently, we support any classification metric from the evaluate library: https://huggingface.co/docs/evaluate/index
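For instance, any other metric name from that library can be passed the same way (a quick sketch; "matthews_correlation" is just one example, and it assumes an eval_dataset with "text" and "label" columns):

trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    metric="matthews_correlation",
)
metrics = trainer.evaluate()  # returns a dict keyed by the metric name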

lewtun · Oct 18 '22

OK, thanks for answering. I was just looking to contribute and thought this was missing from setfit and would have been helpful to add. Thanks for pointing to the library as well.

snayan06 · Oct 18 '22

Is there any way you track features that people can contribute to, or is it just through the issues? I'm new to open source, but I have been using huggingface/sbert models for a long time and thought it would be good if I could contribute in some way.

snayan06 · Oct 18 '22

What if I have a multi-label classification problem and I want to evaluate my eval dataset via "f1" with a specific averaging strategy like 'weighted' or 'micro'? Is that supported?

kamilc-bst · Oct 18 '22

Currently, we support any classification metric from the evaluate library: https://huggingface.co/docs/evaluate/index

@lewtun one limitation, though, seems to be that in the case of non-binary labels there is no way to pass an averaging strategy. This would need to happen here: https://github.com/huggingface/setfit/blob/a06be0efae1f71357aa357af8df42175f6adc91e/src/setfit/trainer.py#L303

So in those cases you need to fall back to accuracy or write your own trainer. Would it be in scope to support those evaluations? I am happy to open a PR for this.
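For reference, the kind of manual fallback I mean looks roughly like this (a sketch outside the Trainer, assuming a trained SetFitModel named model and an eval_dataset with "text" and "label" columns):

import evaluate

# Predict with the trained SetFit model, then compute F1 with an explicit
# averaging strategy instead of going through trainer.evaluate():
f1_metric = evaluate.load("f1")
y_pred = model.predict(eval_dataset["text"])
results = f1_metric.compute(
    predictions=y_pred,
    references=eval_dataset["label"],
    average="weighted",  # or "micro", "macro", etc.
)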

nsorros · Oct 26 '22

Shouldn't this line in the same file (and method) handle the multi-label case, though?

https://github.com/huggingface/setfit/blob/a06be0efae1f71357aa357af8df42175f6adc91e/src/setfit/trainer.py#L295

jmwoloso · Dec 01 '22

Assuming I'm not missing something major, this seems like an easy fix. Here's my current implementation that I'm testing; happy to start a formal PR for this if desired, @lewtun:

https://github.com/huggingface/setfit/compare/main...jmwoloso:setfit:fix_multilabel_metrics?expand=1

EDIT: I haven't added the tests, etc. that I'd need to make this a formal PR (which I'm happy to do); this is just to unblock me from using SetFitTrainer at the moment.
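To summarize the kind of change discussed in this thread (a simplified, hypothetical sketch, not the literal diff in the branch above; the class name is made up and it assumes metric was given as a string):

import evaluate
from setfit import SetFitTrainer

class MultilabelMetricTrainer(SetFitTrainer):
    # Hypothetical sketch: forward keyword arguments (e.g. average="weighted")
    # to the metric's compute() call instead of relying on its defaults.
    def evaluate(self, **metric_kwargs):
        metric_fn = evaluate.load(self.metric)
        y_pred = self.model.predict(self.eval_dataset["text"])
        return metric_fn.compute(
            predictions=y_pred,
            references=self.eval_dataset["label"],
            **metric_kwargs,
        )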

jmwoloso · Dec 02 '22

I was wondering if there have been any updates regarding computing other metrics for multi-label output?

joostjansen · Jun 13 '23

You can use "accuracy" or "f1" (now fully supported for multi-label by also using metric_kwargs) as simple, low-effort solutions for evaluation, but you can also provide a function if you'd like. See this for more information:

https://github.com/huggingface/setfit/blob/b503c0b80b1fce800380e8e9092ab0ad14ae8889/src/setfit/trainer.py#L43-L48
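For example, both options look roughly like this (a sketch; per the docstring linked above, metric_kwargs is forwarded to the metric's compute() call for string metrics, and a callable metric takes (y_pred, y_test)):

from setfit import SetFitTrainer
from sklearn.metrics import f1_score

# Option 1: a string metric plus metric_kwargs, forwarded to compute():
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    metric="f1",
    metric_kwargs={"average": "micro"},  # e.g. "weighted" or "macro"
)

# Option 2: a custom callable taking (y_pred, y_test):
def weighted_f1(y_pred, y_test):
    return {"f1": f1_score(y_test, y_pred, average="weighted")}

trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    metric=weighted_f1,
)
metrics = trainer.evaluate()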

I think this should allow for as much flexibility as is required, so I'll close this.

- Tom Aarsen

tomaarsen · Jun 26 '23