
Add Precision-Recall-Gain curve, Area Under Precision Recall Gain curve, and FGain1 score

siemdejong opened this issue 1 year ago • 5 comments

🚀 Feature

Add Precision-Recall-Gain (PRG) curve as a new feature with the same interface as the Precision-Recall (PR) curve.

Along with the PRG curve, the Area Under the Precision-Recall-Gain curve (AUPRG) can be calculated, as is done for AveragePrecision.

The FGain1 score (FG1) is the gain-transformed F1 score; its isometrics are straight lines parallel to the minor diagonal in PRG space. This could be added as well.

Motivation

The PR curve has some caveats as described in [1]. PRG aims to fix these problems:

  1. baselines are non-universal
  2. interpolation is non-linear
  3. F-isometrics are non-linear
  4. Pareto-front is non-convex
  5. Area under PR curve does not relate to the expected F + there is an unachievable region

In particular, the area under the PR curve is shown to sometimes favour models with lower F1 scores. Using the PRG curve should therefore lead to better model selection.

Pitch

A Torchmetrics implementation of the PRG curve that has the same interface as the PR curve would aid in better model selection.

>>> import torch
>>> pred = torch.tensor([0, 0.1, 0.8, 0.4])
>>> target = torch.tensor([0, 1, 1, 0])
>>> prg_curve = PrecisionRecallGainCurve(task="binary")
>>> precision_gain, recall_gain, thresholds = prg_curve(pred, target)
>>> precision_gain
tensor([1.0000, 0.0000, 0.5000, 0.0000])
>>> recall_gain
tensor([0.0000, 0.0000, 1.0000, 1.0000])
>>> thresholds
...

Precision-Gain (PG) and Recall-Gain (RG) can be calculated as

$$ PG = 1 - \frac{tp + fn}{fp + tn} \cdot \frac{fp}{tp}, $$

and

$$ RG = 1 - \frac{tp + fn}{fp + tn} \cdot \frac{fn}{tp}. $$
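A minimal sketch of these two formulas in plain Python (the helper name is illustrative, not an existing torchmetrics API):

```python
def precision_recall_gain(tp, fp, fn, tn):
    """Compute precision gain and recall gain from confusion-matrix counts.

    Illustrative helper, not part of torchmetrics. Both gains share the
    factor pi / (1 - pi), the prior odds of the positive class.
    """
    prior_odds = (tp + fn) / (fp + tn)  # pi / (1 - pi)
    precision_gain = 1.0 - prior_odds * fp / tp
    recall_gain = 1.0 - prior_odds * fn / tp
    return precision_gain, recall_gain

# Balanced example: pi = 0.5, so the prior odds equal 1
pg, rg = precision_recall_gain(tp=40.0, fp=10.0, fn=10.0, tn=40.0)
# pg = 1 - 10/40 = 0.75 and rg = 1 - 10/40 = 0.75
```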

AUPRG can be calculated as is done for AveragePrecision, but only accounting for the area where PG & RG $\in [0, 1]$.
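A hedged sketch of that area computation via the trapezoidal rule (the function name is hypothetical; it assumes the curve points are sorted by increasing recall gain and simply clips everything to the unit square):

```python
def auprg_trapezoid(precision_gain, recall_gain):
    """Area under the PRG curve via the trapezoidal rule.

    Hypothetical helper, not an existing torchmetrics API: expects
    (recall_gain, precision_gain) pairs sorted by increasing recall gain,
    and clips both coordinates to [0, 1] so only the valid region counts.
    """
    clip = lambda v: min(max(v, 0.0), 1.0)
    area = 0.0
    for i in range(1, len(recall_gain)):
        x0, x1 = clip(recall_gain[i - 1]), clip(recall_gain[i])
        y0, y1 = clip(precision_gain[i - 1]), clip(precision_gain[i])
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# A curve that stays at PG = 1 across RG in [0, 1] covers the whole square
print(auprg_trapezoid([1.0, 1.0], [0.0, 1.0]))  # 1.0
```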

FG1 can be calculated as

$$ FG_1 = \frac{1}{2} PG + \frac{1}{2} RG. $$
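This identity can be checked numerically: [1] maps a metric $m$ to its gain $(m - \pi) / ((1 - \pi) m)$, where $\pi$ is the positive-class prevalence, and applying that transform to the plain F1 score gives exactly the average of PG and RG (the `gain` helper below is illustrative):

```python
def gain(metric, pi):
    """Gain transform from [1]: maps a metric in [pi, 1] onto [0, 1]."""
    return (metric - pi) / ((1.0 - pi) * metric)

# Example counts: tp=30, fp=20, fn=10, tn=40 -> pi = 40/100 = 0.4
tp, fp, fn, tn = 30.0, 20.0, 10.0, 40.0
pi = (tp + fn) / (tp + fp + fn + tn)

f1 = 2 * tp / (2 * tp + fp + fn)  # classic F1
pg = gain(tp / (tp + fp), pi)     # precision gain
rg = gain(tp / (tp + fn), pi)     # recall gain

# FG1 via the gain transform of F1 equals the average of the two gains
assert abs(gain(f1, pi) - 0.5 * (pg + rg)) < 1e-9
```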

It would be even more awesome if PRG could be extended to the multiclass/multilabel case.

Alternatives

The original authors of [1] have developed a package, pyprg (whose dependencies are out of date):

pip install pyprg

Then,

from prg import prg

# targets: ground-truth labels, predictions: model scores
prg_curve = prg.create_prg_curve(labels=targets, scores=predictions)
precision_gain = prg_curve["precision_gain"]
recall_gain = prg_curve["recall_gain"]
auprg = prg.calc_auprg(prg_curve)

Additional context

[1] Flach, P. & Kull, M. "Precision-Recall-Gain Curves: PR Analysis Done Right", NeurIPS 2015. http://people.cs.bris.ac.uk/~flach/PRGcurves/PRcurves.pdf

siemdejong avatar May 11 '23 14:05 siemdejong

Hi! Thanks for your contribution, great first issue!

github-actions[bot] avatar May 11 '23 14:05 github-actions[bot]

Hi @siemdejong, thanks for raising this issue. A couple of questions maybe:

  • How commonly is this metric used? I have not heard of or seen it in any papers I have read.
  • It is good that there is a package to compare against if we make our own implementation, but I see issues like https://github.com/meeliskull/prg/issues/7 and wonder how stable that implementation is.

SkafteNicki avatar May 12 '23 05:05 SkafteNicki

  • The metric is not (yet) commonly used. Obvious reasons might be that 1) people simply do not know about it, 2) it takes an extra step to calculate the plot, 3) no good implementation is available.
  • I have not tested the implementation thoroughly, so I cannot make arguments on the stability of the official implementation.

For another writeup about gain metrics, see https://snorkel.ai/improving-upon-precision-recall-and-f1-with-gain-metrics/

Maybe an interesting discussion on scikit-learn and gain metrics: https://github.com/scikit-learn/scikit-learn/pull/24121

siemdejong avatar May 12 '23 06:05 siemdejong

Hi, can I contribute to this issue?

arijitde92 avatar May 25 '23 10:05 arijitde92

Hi @arijitde92, feel free to make a contribution on this topic :)

SkafteNicki avatar May 25 '23 12:05 SkafteNicki