
How does predict_proba work exactly?

doubianimehdi opened this issue · 2 comments

Hi everyone!

First of all, thanks for this amazing package! It is more than useful for a project at my work, and the 0.6.0 release was much needed on my side!

However, I'd like some clarification on how predict_proba works, because I'm having a hard time understanding it.

Consider this table:

| score | predicted | pred_proba_0 | pred_proba_1 |
|------:|----------:|-------------:|-------------:|
|     1 |         1 |     0.866082 |     0.133918 |
|     1 |         1 |     0.762696 |     0.237304 |
|     1 |         1 |     0.730971 |     0.269029 |
|     1 |         1 |     0.871808 |     0.128192 |
|     1 |         1 |     0.671637 |     0.328363 |
|     1 |         1 |     0.780433 |     0.219567 |
|     1 |         1 |     0.652668 |     0.347332 |
|     1 |         0 |     0.767050 |     0.232950 |

The score column is the true outcome, and predicted is what the predict method gives me at inference time. pred_proba_0 and pred_proba_1 come from this code:

```python
validate_dataset['pred_proba_0'] = trainer.model.predict_proba(
    validate_dataset['Fulltext_clean_translated+metadata_clean_translated'].to_list(),
    as_numpy=True,
)[:, 0]
validate_dataset['pred_proba_1'] = trainer.model.predict_proba(
    validate_dataset['Fulltext_clean_translated+metadata_clean_translated'].to_list(),
    as_numpy=True,
)[:, 1]
```

Also, when I use this code:

```python
model.predict_proba(
    validate_dataset['Fulltext_clean_translated+metadata_clean_translated'].to_list(),
    as_numpy=True,
)
```

I get this output:

```
array([[9.1999289e-07, 9.9999905e-01],
       [7.2725675e-07, 9.9999928e-01],
       [8.1967613e-07, 9.9999917e-01],
       ...,
       [9.4037086e-06, 9.9999058e-01],
       [9.1749916e-07, 9.9999905e-01],
       [1.2628381e-06, 9.9999869e-01]], dtype=float32)
```
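
(For anyone wanting to check the column order themselves, here is a minimal sketch, assuming the default scikit-learn LogisticRegression head; the model path and texts are placeholders, not from this thread.)

```python
from setfit import SetFitModel

# Placeholder model path and inputs, standing in for the setup above.
model = SetFitModel.from_pretrained("path/to/your/finetuned-setfit-model")
texts = ["first validation document", "second validation document"]

probs = model.predict_proba(texts, as_numpy=True)  # shape: (n_samples, n_classes)

# With the default scikit-learn LogisticRegression head, the columns of
# predict_proba follow the fitted label order, which sklearn exposes as
# classes_. If this prints [0 1], then probs[:, 0] is P(class 0).
print(model.model_head.classes_)
print(probs)
```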

My question is: does predict_proba output (probability of predicting 0, probability of predicting 1)? It doesn't seem like it, because of this row:

| row | score | predicted | pred_proba_0 | pred_proba_1 |
|----:|------:|----------:|-------------:|-------------:|
| 229 |     0 |         1 |     0.694485 |     0.305515 |

Something else is strange: trainer.model.predict_proba doesn't give the same result as model.predict_proba... Can someone please explain to help me understand?

Thank you very much!

doubianimehdi · Feb 10 '23, 14:02

Hello!

I've been trying to reproduce your findings, but haven't been able to yet. In my quick tests just now, I only get the expected results, where the class with the higher probability from predict_proba is also the class returned by predict. Also, internally, predict relies on a torch.argmax() over the predict_proba results, so your findings are surprising. Beyond that, trainer.model.predict_proba should refer to the exact same function on the same class instance as model.predict_proba, so the discrepancy there is also very confusing.
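
(A minimal sketch of that consistency check, assuming integer labels and reusing the placeholder model and texts from the earlier sketch; this is not code from SetFit itself.)

```python
import numpy as np

# If predict is an argmax over predict_proba, the two calls must agree
# for every sample. `model` and `texts` are the placeholders from above.
probs = model.predict_proba(texts, as_numpy=True)
preds = np.asarray(model.predict(texts))

assert (preds == np.argmax(probs, axis=-1)).all(), "predict disagrees with predict_proba"
```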

I'm unsure how to help further at this time, as I'm struggling to reproduce it.

  • Tom Aarsen

tomaarsen · Feb 14 '23, 12:02

@doubianimehdi I got similar results using predict_proba:

```python
preds = [[0, 1]]
scores = [
    [[0.9984156383957592, 0.001584361604240799]],
    [[0.4313095716935773, 0.5686904283064227]],
]
```

The way I understand this: the 1st value is the probability of 0 and the 2nd is the probability of 1 (in my case).

@tomaarsen A multilabel / multi-class model gives a probability for both True and False (0s and 1s) for each label. Hope this helps!
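
(A hedged illustration of that reading, reusing the numbers from the snippet above: each label gets its own [P(absent), P(present)] pair, and picking the larger entry per pair reproduces the predictions.)

```python
# One [P(0), P(1)] pair per label, copied from the snippet above.
scores = [
    [[0.9984156383957592, 0.001584361604240799]],  # label 0: P(0) wins -> 0
    [[0.4313095716935773, 0.5686904283064227]],    # label 1: P(1) wins -> 1
]

# For each label, predict 1 when P(1) exceeds P(0).
preds = [int(pair[0][1] > pair[0][0]) for pair in scores]
print(preds)  # [0, 1], matching preds = [[0, 1]] above
```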

miteshkotak · Nov 15 '23, 08:11