FEA Confusion matrix derived metrics
Reference Issues/PRs
Continues #17265 Fixes #5516
What does this implement/fix? Explain your changes.
Add metrics.specificity_score and metrics.npv_score, based on metrics.fpr_tpr_fnr_tnr_scores from #17265
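A quick usage sketch of the proposed API (example data made up; the call pattern, e.g. the average parameter, is assumed to mirror recall_score):

```python
# Hypothetical usage of the metrics proposed in this PR; the exact signature
# (e.g. the `average` parameter) is assumed to mirror recall_score.
from sklearn.metrics import specificity_score, npv_score

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 1, 1, 0]

print(specificity_score(y_true, y_pred, average="macro"))  # true negative rate
print(npv_score(y_true, y_pred, average="macro"))          # negative predictive value
```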
Any other comments?
@jnothman - Hi, will you be in a position to review this PR?
Sorry, I'd forgotten to submit this partial review a few weeks ago.
No worries. Please have a look at the current version.
@jnothman I am wondering if we are not duplicating code here. I am thinking that we could have a single private function that computes all the necessary statistics, and then precision-recall and the other metrics would only call this function. We could even pass a dictionary parameter that explicitly requests some of the metrics to be returned as a bunch.
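Something along these lines, as a rough sketch only (all names are hypothetical and this is not an actual implementation):

```python
# Sketch: compute the per-class TP/FP/FN/TN counts once in a private helper,
# then let each public metric (and an optional "requested metrics" API)
# derive its value from those shared counts. Names are hypothetical.
from sklearn.metrics import multilabel_confusion_matrix


def _confusion_statistics(y_true, y_pred):
    """Per-class (tp, fp, fn, tn) from the one-vs-rest confusion matrices."""
    mcm = multilabel_confusion_matrix(y_true, y_pred)
    return mcm[:, 1, 1], mcm[:, 0, 1], mcm[:, 1, 0], mcm[:, 0, 0]


def _requested_metrics(y_true, y_pred, requested=("recall", "specificity")):
    """Return only the explicitly requested per-class metrics as a dict."""
    tp, fp, fn, tn = _confusion_statistics(y_true, y_pred)
    available = {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
    }
    return {name: available[name] for name in requested}
```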
Here, I am also questioning adding the tpr/fpr/fnr/tnr function, since it would duplicate statistics already computed by the recall score and, potentially later, the specificity. It seems that @amueller wanted to avoid this in the original issue: https://github.com/scikit-learn/scikit-learn/issues/5516#issuecomment-306464383
multilabel_confusion_matrix removed a lot of the duplication. The problem is that users want these metrics and, for whatever reason, find them hard to compute themselves. And given how much revision we've made to P/R/F, there are certainly tricky cases! The present PR could have less duplication, but the main duplication here is in the tests, which will require more work to refactor.
I had thought that, to avoid clutter, we could avoid putting these measures into metrics, but make them available when a user sought the set of scorers relevant to a particular task, vis-à-vis #12385.
> multilabel_confusion_matrix removed a lot of the duplication. The problem is that users want these metrics and, for whatever reason, find them hard to compute themselves. And given how much revision we've made to P/R/F, there are certainly tricky cases! The present PR could have less duplication, but the main duplication here is in the tests, which will require more work to refactor.
OK, so let's go ahead with this proposal. I will do a first round of review and look afterwards at whether we can refactor the tests.
@Pawel-Kranzberg Could you resolve the conflicts in the files? I assume the conflicts here are just related to running black on the file.
The current metrics.classification_report returns:
- Precision (Positive Predictive Value)
- Recall (True Positive Rate)
- F1
- Support
It is great, but far from complete. The following two are very important as well:
- Negative Predictive Value (NPV)
- Specificity (True Negative Rate, or TNR)
The metrics.specificity_score is a great addition because both TNR and NPV need it; it is really the key here. I really hope both TNR and NPV can be included in the classification report sometime in the future.
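For reference, a minimal run of the current report (illustrative data only), which shows exactly the four columns listed above and nothing else:

```python
from sklearn.metrics import classification_report

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 1, 1, 0]

# Prints precision, recall, f1-score and support per class;
# no specificity (TNR) or NPV columns yet.
print(classification_report(y_true, y_pred))
```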
Once we can cover these four metrics for multiclass classification:
- Precision (Positive Predictive Value) --> requires (TP, FP)
- Recall (True Positive Rate) --> requires (TP, FN)
- Negative Predictive Value (NPV) --> requires (TN, FN)
- Specificity (True Negative Rate) --> requires (TN, FP)
we can pretty much derive the rest of the metrics (see the sketch after this list), such as:
- FNR (False Negative Rate = 1 - TPR)
- FPR (False Positive Rate = 1 - TNR)
- FDR (False Discovery Rate = 1 - PPV)
- FOR (False Omission Rate = 1 - NPV)
- ACC (Accuracy)
- MCC (Matthews Correlation Coefficient)
- Prevalence Threshold
- etc.
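To illustrate (a sketch using the existing multilabel_confusion_matrix, not this PR's implementation), all of the rates above fall out of the four per-class counts, with the false-rates being simple complements:

```python
# Illustration only: deriving the rates listed above from the existing
# per-class one-vs-rest confusion matrices.
from sklearn.metrics import multilabel_confusion_matrix

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 1, 1, 0]

mcm = multilabel_confusion_matrix(y_true, y_pred)  # each block is [[TN, FP], [FN, TP]]
tn, fp, fn, tp = mcm[:, 0, 0], mcm[:, 0, 1], mcm[:, 1, 0], mcm[:, 1, 1]

ppv = tp / (tp + fp)  # precision
tpr = tp / (tp + fn)  # recall / sensitivity
tnr = tn / (tn + fp)  # specificity
npv = tn / (tn + fn)  # negative predictive value

fnr = 1 - tpr  # false negative rate
fpr = 1 - tnr  # false positive rate
fdr = 1 - ppv  # false discovery rate
false_omission_rate = 1 - npv  # FOR ("for" is a Python keyword)
```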
Excellent work! 👍 👍 👍
@Scoodood I would like to focus this PR only on the metrics addition, without touching the classification report. That could be done in a subsequent PR. Could you open an issue mentioning your feature request regarding the report? I assume that at some point it could be interesting to have a parameter to filter the desired metrics.
@Pawel-Kranzberg We could consider adding the NPV as well to close the loop.
> Could you open an issue mentioning your feature request regarding the report?

It's done: https://github.com/scikit-learn/scikit-learn/issues/21000
> @Pawel-Kranzberg Could you resolve the conflicts in the files? I assume the conflicts here are just related to running black on the file.
Done.
> @Pawel-Kranzberg We could consider adding the NPV as well to close the loop.
Done.
@glemaitre - Thank you for the comments. The PR is ready for another round of review.
@Scoodood - FYI, I've added the NPV score function (npv_score).
> @Scoodood - FYI, I've added the NPV score function (npv_score).

Hi @Pawel-Kranzberg, just wondering when these new metrics will be integrated into the official scikit-learn for public consumption?
> just wondering when these new metrics will be integrated into the official scikit-learn for public consumption?
I keep my fingers crossed...
Nice, the CI is passing. I will make a new pass on it.
We won't have time to finish the review on this one before the 1.2 release. Moving it to 1.3.
@lorentzenchr would you be able to maybe have a look?
Given the long discussion in #5516, it is better to ping @jnothman @amueller @glemaitre.
@Pawel-Kranzberg Could you please stop merging the main branch as long as there are no merge conflicts? It can hide discussions on GitHub and also generates notifications, and it is in your interest that maintainers don't unsubscribe from this PR.