Compute metrics with selected threshold

Open pplonski opened this issue 4 years ago • 9 comments

Based on discussion in https://github.com/mljar/mljar-supervised/discussions/418 it will be helpful to compute metrics with the same threshold value.

pplonski avatar Jun 25 '21 07:06 pplonski

Hi, is the issue still open? I would like to work on this feature. I'm new to the open-source world, so please help me with the info.

neilmehta31 avatar Sep 08 '21 08:09 neilmehta31

Hi @neilmehta31, great that you would like to help with this issue. Yes, it is still open.

The problem is that the model README.md currently reports many metrics, but each metric is computed at its own threshold value. Maybe we can add one more table that reports all metrics at the same threshold (I think we should use the threshold that is computed for eval_metric or for Accuracy).

@neilmehta31 please ask if you have any questions.

pplonski avatar Sep 08 '21 09:09 pplonski
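
The idea described above (pick one threshold, then evaluate every threshold-dependent metric at it) can be sketched in plain Python. This is only an illustration and not the mljar-supervised implementation: the function names, the candidate-threshold scan, the tie-breaking rule, and the toy data are all assumptions made for the example.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, fn, tn); probabilities >= threshold count as positive."""
    tp = fp = fn = tn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_at(y_true, y_prob, threshold):
    tp, fp, fn, tn = confusion_counts(y_true, y_prob, threshold)
    return (tp + tn) / len(y_true)


def best_accuracy_threshold(y_true, y_prob):
    """Scan the observed probabilities and keep the one with the highest accuracy.

    Assumption: ties are resolved in favor of the lowest threshold.
    """
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: accuracy_at(y_true, y_prob, t))


def metrics_at_threshold(y_true, y_prob, threshold):
    """Compute all threshold-dependent metrics at one shared threshold."""
    tp, fp, fn, tn = confusion_counts(y_true, y_prob, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "threshold": threshold,
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data, invented purely for illustration.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.90]

shared = best_accuracy_threshold(y_true, y_prob)
report = metrics_at_threshold(y_true, y_prob, shared)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Threshold-free metrics such as logloss and AUC are left out of the sketch because they do not change with the threshold.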

Thanks, @pplonski, for the reply. There are multiple algorithms used, and each has its own README.md, so a table with the same threshold for all metrics should be added to each algorithm's README.md, right? I will start the work right away.

neilmehta31 avatar Sep 09 '21 09:09 neilmehta31

@neilmehta31 yes, in each README.md there should be an additional table.

pplonski avatar Sep 13 '21 06:09 pplonski

Hey @pplonski, I have made the necessary changes and added a table that reports the metrics at a single threshold, using the value computed for Accuracy. Could you please tell me if anything else should be added or edited in the table? I am attaching one for your reference. If everything seems fine, I will make a PR.

Metric details

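|           | score    | threshold   |
|-----------|----------|-------------|
| logloss   | 0.33148  | nan         |
| auc       | 0.903267 | nan         |
| f1        | 0.688375 | 0.35753     |
| accuracy  | 0.846847 | 0.4854      |
| precision | 0.976471 | 0.974946    |
| recall    | 1        | 3.88177e-06 |
| mcc       | 0.58214  | 0.35753     |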

Metric details with same threshold value (Accuracy threshold)

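|           | score    | threshold |
|-----------|----------|-----------|
| logloss   | 0.33148  | nan       |
| auc       | 0.903267 | nan       |
| f1        | 0.67546  | 0.4854    |
| accuracy  | 0.846847 | 0.4854    |
| precision | 0.689582 | 0.4854    |
| recall    | 0.661905 | 0.4854    |
| mcc       | 0.575497 | 0.4854    |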

Confusion matrix (at threshold=0.4854)

|                  | Predicted as <=50K | Predicted as >50K |
|------------------|--------------------|-------------------|
| Labeled as <=50K | 4197               | 438               |
| Labeled as >50K  | 497                | 973               |

neilmehta31 avatar Sep 28 '21 16:09 neilmehta31
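
As a sanity check, the shared-threshold scores in the comment above can be recomputed directly from the posted confusion matrix. The sketch below uses the standard textbook definitions and treats >50K as the positive class (an assumption, though it is the reading under which the numbers agree). The helper name `metrics_from_confusion` is hypothetical and is not part of mljar-supervised.

```python
import math


def metrics_from_confusion(tn, fp, fn, tp):
    """Derive threshold-dependent metrics from the four confusion-matrix cells."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {
        "f1": f1,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
    }


# Cells taken from the confusion matrix at threshold=0.4854 in the comment above.
recomputed = metrics_from_confusion(tn=4197, fp=438, fn=497, tp=973)
for name, value in recomputed.items():
    print(f"{name}: {value:.6f}")
```

With these four cells, the recomputed values agree with the f1, accuracy, precision, recall, and mcc rows of the shared-threshold table to the precision shown there.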

@neilmehta31 looks very good! :+1:

Maybe I will change the title 'Metric details with same threshold value (Accuracy threshold)' to:

  • 'Metric details with threshold from accuracy metric' or
  • 'Metric details with threshold=0.4854'

@neilmehta31 please select the title version. I'm waiting for a PR from you.

pplonski avatar Sep 29 '21 06:09 pplonski

@pplonski Could you please tell me what you meant by selecting the title version? Excited to contribute to the repo!!

neilmehta31 avatar Sep 29 '21 08:09 neilmehta31

Which one will be better: 'Metric details with threshold from accuracy metric' or 'Metric details with threshold=0.4854' for the title over the table?

pplonski avatar Sep 29 '21 08:09 pplonski

I think 'Metric details with threshold from accuracy metric' would be better, as the threshold value can be seen in the table anyway. What do you say? I'll update the title before the PR.

neilmehta31 avatar Sep 29 '21 08:09 neilmehta31