Wrong Calculation of Mean Average Precision
🐛 Bug
Wrong calculation of mean Average Precision (mAP). I get
```
{'map': tensor(1.),
 'map_50': tensor(1.),
 'map_75': tensor(-1),
 'map_small': tensor(1.),
 'map_medium': tensor(-1.),
 'map_large': tensor(-1.),
 'mar_1': tensor(1.),
 'mar_10': tensor(1.),
 'mar_100': tensor(1.),
 'mar_small': tensor(1.),
 'mar_medium': tensor(-1.),
 'mar_large': tensor(-1.),
 'map_per_class': tensor([ 1.,  1., -1., -1.]),
 'mar_100_per_class': tensor([ 1.,  1., -1., -1.])}
```
while I would expect 'map': tensor(0.5) and 'map_per_class': tensor([1., 1., 0., 0.]) (see example below).
To Reproduce
Steps to reproduce the behavior...
```python
import torch
from torchmetrics.detection.mean_ap import MeanAveragePrecision

metric = MeanAveragePrecision(iou_thresholds=[0.5], class_metrics=True)
preds = [
    dict(
        boxes=torch.Tensor([[0, 0, 20, 20],
                            [30, 30, 50, 50],
                            [70, 70, 90, 90],        # FP
                            [100, 100, 120, 120]]),  # FP
        scores=torch.Tensor([0.6, 0.6, 0.6, 0.6]),
        labels=torch.IntTensor([0, 1, 2, 3]),
    )
]
targets = [
    dict(
        boxes=torch.Tensor([[0, 0, 20, 20],
                            [30, 30, 50, 50]]),
        labels=torch.IntTensor([0, 1]),
    )
]
metric.update(preds, targets)
metric.compute()
```
Expected behavior
AP for classes 2 and 3 must be 0; thus, the mAP would become 0.5, as the quick check below shows.
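A sanity check of the expected numbers (plain arithmetic over the per-class APs, not tied to the library internals):

```python
# Classes 0 and 1 are matched perfectly (AP = 1); classes 2 and 3 produce
# only false positives, so their AP should be 0, not the sentinel -1.
ap_per_class = [1.0, 1.0, 0.0, 0.0]
map_expected = sum(ap_per_class) / len(ap_per_class)
print(map_expected)  # 0.5
```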
Environment
- TM version is 0.9.3 (installed with pip)
- Python 3.7.4
- Torch 1.10.0+cu111 (installed with pip)
- Windows Subsystem for Linux
Hi! Thanks for your contribution, great first issue!
Any updates on this issue?
Why does the above happen?

- Precision and Recall are initialized as tensors containing -1s here.
- When the code enters these nested for-loops, the static method `MeanAveragePrecision.__calculate_recall_precision_scores` is called. For class indices 2 and 3, this method does not cause any change in the corresponding locations in the Precision and Recall tensors (`precision[:, :, class_idx, :, :]` and `recall[:, class_idx, :, :]`, respectively). This happens because the method returns without any change here for the given example. Hence, all entries are still -1s for these classes.
- Since all entries are -1s, this call used for calculating mean AP returns a `torch.tensor([-1.0])`.
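A minimal sketch of how the sentinel values leak into the final score (this mirrors the pycocotools-style summarization; the helper name here is hypothetical, not the actual private method in torchmetrics):

```python
import torch

def summarize(precision: torch.Tensor) -> torch.Tensor:
    # Keep only entries that were actually written during evaluation.
    valid = precision[precision > -1]
    if valid.numel() == 0:
        # No slot was ever updated, so the sentinel -1 propagates to the result.
        return torch.tensor([-1.0])
    return valid.mean()

precision = -torch.ones(4)  # initialized with the sentinel value -1
precision[0] = 1.0          # class 0 was evaluated
precision[1] = 1.0          # class 1 was evaluated
print(summarize(precision))       # tensor(1.) -- classes 2 and 3 are silently skipped
print(summarize(precision[2:3]))  # tensor([-1.]) -- per-class AP for class 2
```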
What can be done to solve this issue?
In such a case, the static method `MeanAveragePrecision.__calculate_recall_precision_scores` can be changed to modify `precision` and `recall` when there are no GT bboxes. Basically, your example represents an edge case where there cannot be any True Positives (TP); hence, Precision must be zero. Recall will be undefined in this case, since the counts of both TP and FN are zero. Note that, in your example, you had FPs for classes 2 and 3; if they were not present, then Precision would also be undefined for these classes.
Hope it helps! 😄
> What can be done to solve this issue?

@dhananjaisharma10 would you be interested in sending a draft PR with a suggested fix? :rabbit:
@Borda Sure, I can commence work on this.
@dhananjaisharma10, how are you doing? Can we help you with something? :otter:
@Borda I was a little held up in the last 2-3 weeks, though I am relatively free now. Is it okay if I provide an update by this weekend?
EDIT: I have commenced work on it.
> Is it okay if I provide an update by this weekend?

sure, whenever you have time :)
@Borda could you please check the Draft PR and guide further. Thanks.
> @Borda could you please check the Draft PR and guide further. Thanks.
could you pls open the PR to this repo? :otter: