
Some metrics handle absent values incorrectly


🐛 Bug

Some metrics, such as Accuracy, Precision, Recall and F1Score, handle absent values incorrectly. A class that is absent from both target and preds, and is therefore correctly predicted as absent, is counted as incorrect.

To Reproduce

Steps to reproduce the behavior...

  • Initialize the metric with num_classes = 2, mdmc_average = "samplewise" and average = "none";
  • create a batch of two samples that are predicted perfectly, but where one of the classes is absent from one of the samples.

Code sample

import torch
from torchmetrics import Accuracy

target = torch.tensor(
    [
        [0,0,0,0],
        [0,0,1,1],
    ]
)
preds = torch.tensor(
    [
        [0,0,0,0],
        [0,0,1,1],
    ]
)

acc = Accuracy(num_classes=2, average="none", mdmc_average="samplewise")
actual_result = acc(preds, target)

expected_result = torch.tensor([1., 1.])
assert torch.equal(expected_result, actual_result)

Expected behavior

The result should be torch.tensor([1., 1.]) because both classes are predicted correctly for both elements of the batch: in the first element, the target tensor expects class 1 to be absent, and it is. Despite this, the metric returns torch.tensor([1., 0.5]), because in the first element of the batch the value of the metric for class 1 is 0.0.
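
For illustration, here is a hand computation of how the reported value can arise. This is only a sketch of the failure mode, not the actual torchmetrics implementation: it scores a class that is absent from both preds and target as 0.0 instead of skipping it, and then averages per class over the samples.

import torch

target = torch.tensor([[0, 0, 0, 0], [0, 0, 1, 1]])
preds = torch.tensor([[0, 0, 0, 0], [0, 0, 1, 1]])

per_sample_scores = []
for t, p in zip(target, preds):
    scores = []
    for cls in (0, 1):
        support = (t == cls).sum().item()
        if support == 0:
            # absent class scored as 0.0 -> this is the problematic behaviour
            scores.append(0.0)
        else:
            correct = ((p == cls) & (t == cls)).sum().item()
            scores.append(correct / support)
    per_sample_scores.append(scores)

# per-class average over samples: class 0 -> 1.0, class 1 -> (0.0 + 1.0) / 2 = 0.5
print(torch.tensor(per_sample_scores).mean(dim=0))  # tensor([1.0000, 0.5000])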

Environment

  • OS (e.g., Linux): Linux
  • Python & PyTorch Version (e.g., 1.0): 3.8.12 & 1.11
  • How you installed PyTorch (conda, pip, build command if you used source): pip
  • Any other relevant information:

Additional context

The metric JaccardIndex provides an absent_score argument to handle such cases (see the sketch below).
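
For comparison, a minimal sketch of that workaround, assuming the pre-refactor JaccardIndex signature (num_classes, absent_score, reduction): setting absent_score=1.0 treats a class that appears in neither preds nor target as perfectly predicted.

import torch
from torchmetrics import JaccardIndex

target = torch.tensor([0, 0, 0, 0])  # class 1 is absent from both tensors
preds = torch.tensor([0, 0, 0, 0])

# absent_score sets the per-class score when a class is missing from preds and target
jaccard = JaccardIndex(num_classes=2, absent_score=1.0, reduction="none")
jaccard(preds, target)  # tensor([1., 1.])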

faber6911 · May 08 '22 10:05

Hi! Thanks for your contribution, great first issue!

github-actions[bot] · May 08 '22 10:05

The issue will be fixed by the classification refactor: see issue https://github.com/Lightning-AI/metrics/issues/1001 and PR https://github.com/Lightning-AI/metrics/pull/1195 for all the changes.

Small recap: this issue describes that the accuracy metric does not compute the right value in the binary setting. The problem with the current implementation is that the metric is calculated as an average over the 0 and 1 classes, which is wrong.

After the refactor this has been fixed. Using the new binary_* version of the metric on the provided example:

from torchmetrics.functional import binary_accuracy
import torch
target = torch.tensor(
    [
        [0,0,0,0],
        [0,0,1,1],
    ]
)
preds = torch.tensor(
    [
        [0,0,0,0],
        [0,0,1,1],
    ]
)
binary_accuracy(preds, target, multidim_average="samplewise")  # tensor([1., 1.])

which gives the correct result. The issue will be closed when https://github.com/Lightning-AI/metrics/pull/1195 is merged.
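
The refactor also ships module-based counterparts; below is a sketch of the equivalent call, assuming the torchmetrics.classification.BinaryAccuracy class from the refactored API.

import torch
from torchmetrics.classification import BinaryAccuracy

target = torch.tensor([[0, 0, 0, 0], [0, 0, 1, 1]])
preds = torch.tensor([[0, 0, 0, 0], [0, 0, 1, 1]])

# multidim_average="samplewise" returns one accuracy per sample in the batch
metric = BinaryAccuracy(multidim_average="samplewise")
metric(preds, target)  # tensor([1., 1.])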

SkafteNicki · Aug 30 '22 14:08