
FEA Confusion matrix derived metrics

Open Pawel-Kranzberg opened this issue 4 years ago • 23 comments

Reference Issues/PRs

Continues #17265. Fixes #5516.

What does this implement/fix? Explain your changes.

Adds metrics.specificity_score and metrics.npv_score, based on metrics.fpr_tpr_fnr_tnr_scores from #17265.
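For reference, in the binary case the two proposed scores reduce to simple ratios over confusion-matrix cells. A minimal pure-Python sketch of the definitions (not the PR's actual implementation, which also handles multiclass averaging, zero division, etc.):

```python
def specificity(y_true, y_pred):
    # Specificity (TNR) = TN / (TN + FP):
    # of all actual negatives, the fraction predicted negative.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)

def npv(y_true, y_pred):
    # NPV = TN / (TN + FN):
    # of all predicted negatives, the fraction actually negative.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tn / (tn + fn)
```

For example, with y_true = [0, 0, 0, 1, 1, 1, 1, 0] and y_pred = [0, 1, 0, 1, 0, 1, 1, 0], both functions return 0.75.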

Any other comments?

Pawel-Kranzberg avatar Feb 25 '21 10:02 Pawel-Kranzberg

@jnothman - Hi, will you be in a position to review this PR?

Pawel-Kranzberg avatar Mar 12 '21 07:03 Pawel-Kranzberg

Sorry, I'd forgotten to submit this partial review a few weeks ago

No worries. Please have a look at the current version.

Pawel-Kranzberg avatar Apr 22 '21 14:04 Pawel-Kranzberg

@jnothman I am wondering if we are not duplicating code here. I am thinking that we could have a single private function that computes all the necessary statistics, and then the precision-recall and other metrics would only call this function. We could even pass a dictionary parameter that explicitly requests which metrics should be returned as a Bunch.

Here, I am also questioning adding the tpr/fpr/fnr/tnr function, since its statistics would duplicate those of the recall score and, potentially later, the specificity. It seems that @amueller wanted to avoid this in the original issue: https://github.com/scikit-learn/scikit-learn/issues/5516#issuecomment-306464383

glemaitre avatar Jul 29 '21 17:07 glemaitre

multilabel_confusion_matrix removed a lot of the duplication. The problem is that users want these metrics and, for whatever reason, find them hard to compute themselves. And given how much revision we've made to P/R/F, there are certainly tricky cases! The present PR could have less duplication, but the main place there is duplication here is in the tests, which will require more work to refactor.
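As an illustration of the workaround users have today, per-class specificity can already be assembled from multilabel_confusion_matrix; a sketch (relying on its one-vs-rest [[tn, fp], [fn, tp]] per-class layout):

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

# One 2x2 matrix per class, each laid out [[tn, fp], [fn, tp]].
mcm = multilabel_confusion_matrix(y_true, y_pred)
tn, fp = mcm[:, 0, 0], mcm[:, 0, 1]
specificity_per_class = tn / (tn + fp)
```

This is exactly the kind of boilerplate the proposed specificity_score would wrap for users.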

I had thought that, to avoid clutter, we could avoid putting these measures into metrics, but make them available when a user seeks the set of scorers relevant to a particular task, vis-à-vis #12385.

jnothman avatar Jul 31 '21 12:07 jnothman

multilabel_confusion_matrix removed a lot of the duplication. The problem is that users want these metrics and, for whatever reason, find them hard to compute themselves. And given how much revision we've made to P/R/F, there are certainly tricky cases! The present PR could have less duplication, but the main place there is duplication here is in the tests, which will require more work to refactor.

OK, so let's go ahead with this proposal. I will make the first round of review and look afterwards at whether we can refactor the tests.

glemaitre avatar Sep 08 '21 15:09 glemaitre

@Pawel-Kranzberg Could you resolve the conflicts in the files? I assume that the conflicts here are just related to running black on the files.

glemaitre avatar Sep 08 '21 15:09 glemaitre

The current metrics.classification_report returns:

  • Precision (Positive Predictive Value)
  • Recall (True Positive Rate)
  • F1
  • Support

It is great, but far from complete. The following two are very important as well:

  • Negative Predictive Value (NPV)
  • Specificity (True Negative Rate, or TNR)

The metrics.specificity_score is a great addition because both TNR and NPV need it. I really hope both TNR and NPV can be included in the classification report sometime in the future. The metrics.specificity_score is really the key here.

Once we can cover these four metrics for multiclass classification:

  • Precision (Positive Predictive Value) --> requires (TP, FP)
  • Recall (True Positive Rate) --> requires (TP, FN)
  • Negative Predictive Value (NPV) --> requires (TN, FN)
  • Specificity (True Negative Rate) --> requires (TN, FP)

we can pretty much derive the rest of the metrics, such as

  • FNR (False Negative Rate = 1 - TPR)
  • FPR (False Positive Rate = 1 - TNR)
  • FDR (False Discovery Rate = 1 - PPV)
  • FOR (False Omission Rate = 1 - NPV)
  • ACC (Accuracy)
  • MCC (Matthews Correlation Coefficient)
  • Prevalence Threshold
  • ...etc

Excellent work! 👍 👍 👍

Scoodood avatar Sep 09 '21 05:09 Scoodood

@Scoodood I would like to focus this PR only on the metrics addition, without touching the classification report; that could be done in a subsequent PR. Could you open a separate issue describing your feature request regarding the report? I assume that at some point it could be interesting to have a parameter to filter the desired metrics.

glemaitre avatar Sep 09 '21 08:09 glemaitre

@Pawel-Kranzberg We could consider adding the NPV as well to close the loop.

glemaitre avatar Sep 09 '21 08:09 glemaitre

@Scoodood I would like to focus this PR only on the metrics addition, without touching the classification report; that could be done in a subsequent PR. Could you open a separate issue describing your feature request regarding the report? I assume that at some point it could be interesting to have a parameter to filter the desired metrics.

It's done. https://github.com/scikit-learn/scikit-learn/issues/21000

Scoodood avatar Sep 09 '21 16:09 Scoodood

@Pawel-Kranzberg Could you resolve the conflicts in the files? I assume that the conflicts here are just related to running black on the files.

Done.

Pawel-Kranzberg avatar Sep 15 '21 10:09 Pawel-Kranzberg

@Pawel-Kranzberg We could consider adding the NPV as well to close the loop.

Done.

Pawel-Kranzberg avatar Sep 18 '21 09:09 Pawel-Kranzberg

@glemaitre - Thank you for the comments. The PR is ready for another round of review.

Pawel-Kranzberg avatar Sep 18 '21 18:09 Pawel-Kranzberg

@Scoodood - FYI, I've added the NPV score function (npv_score).

Pawel-Kranzberg avatar Sep 18 '21 18:09 Pawel-Kranzberg

@Scoodood - FYI, I've added the NPV score function (npv_score).

Hi @Pawel-Kranzberg, just wondering when these new metrics will be integrated into the official scikit-learn for public consumption?

Scoodood avatar Nov 10 '21 02:11 Scoodood

@Scoodood - FYI, I've added the NPV score function (npv_score).

Hi @Pawel-Kranzberg, just wondering when these new metrics will be integrated into the official scikit-learn for public consumption?

I keep my fingers crossed...

Pawel-Kranzberg avatar Nov 11 '21 21:11 Pawel-Kranzberg

Nice, the CI is passing. I will make a new pass on it.

glemaitre avatar Feb 17 '22 15:02 glemaitre

We won't have time to finish the review on this one before the 1.2 release. Moving it to 1.3.

jeremiedbb avatar Nov 24 '22 13:11 jeremiedbb

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 4aa4449. Link to the linter CI: here

github-actions[bot] avatar Jun 22 '23 13:06 github-actions[bot]

[image attachment]

Pawel-Kranzberg avatar Jan 17 '24 18:01 Pawel-Kranzberg

@lorentzenchr would you be able to maybe have a look?

adrinjalali avatar Apr 16 '24 16:04 adrinjalali

From the long discussion in #5516, better ping @jnothman @amueller @glemaitre .

lorentzenchr avatar Apr 16 '24 19:04 lorentzenchr

@Pawel-Kranzberg Could you please stop merging the main branch as long as there are no merge conflicts? The reason is that it might hide discussions on GitHub and also generates notifications. And it should be in your interest that maintainers don't unsubscribe from this PR.

lorentzenchr avatar May 01 '24 16:05 lorentzenchr