Wrong Calculation of Mean Average Precision
🐛 Bug
Wrong calculation of mean Average Precision (mAP). I get
```
{'map': tensor(1.),
 'map_50': tensor(1.),
 'map_75': tensor(-1),
 'map_small': tensor(1.),
 'map_medium': tensor(-1.),
 'map_large': tensor(-1.),
 'mar_1': tensor(1.),
 'mar_10': tensor(1.),
 'mar_100': tensor(1.),
 'mar_small': tensor(1.),
 'mar_medium': tensor(-1.),
 'mar_large': tensor(-1.),
 'map_per_class': tensor([ 1.,  1., -1., -1.]),
 'mar_100_per_class': tensor([ 1.,  1., -1., -1.])}
```
while I would expect 'map': tensor(0.5) and 'map_per_class': tensor([1., 1., 0., 0.]) (see example below).
To Reproduce
Steps to reproduce the behavior...
```python
import torch
from torchmetrics.detection.mean_ap import MeanAveragePrecision

metric = MeanAveragePrecision(iou_thresholds=[0.5], class_metrics=True)
preds = [
    dict(
        boxes=torch.Tensor([[0, 0, 20, 20],
                            [30, 30, 50, 50],
                            [70, 70, 90, 90],        # FP
                            [100, 100, 120, 120]]),  # FP
        scores=torch.Tensor([0.6, 0.6, 0.6, 0.6]),
        labels=torch.IntTensor([0, 1, 2, 3]),
    )
]
targets = [
    dict(
        boxes=torch.Tensor([[0, 0, 20, 20],
                            [30, 30, 50, 50]]),
        labels=torch.IntTensor([0, 1]),
    )
]
metric.update(preds, targets)
metric.compute()
```
Expected behavior
AP for classes 2 and 3 must be 0; thus, the mAP would become 0.5, as the quick check below shows.
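A sanity check of the expected numbers (plain arithmetic over the per-class APs, not tied to the library internals):

```python
# Classes 0 and 1 are matched perfectly (AP = 1); classes 2 and 3 produce
# only false positives, so their AP should be 0, not the sentinel -1.
ap_per_class = [1.0, 1.0, 0.0, 0.0]
map_expected = sum(ap_per_class) / len(ap_per_class)
print(map_expected)  # 0.5
```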
Environment
- TM version is 0.9.3 (installed with pip)
- Python 3.7.4
- Torch 1.10.0+cu111 (installed with pip)
- Windows Subsystem for Linux
Hi! Thanks for your contribution, great first issue!
Any updates on this issue?
Why does the above happen?

- Precision and Recall are initialized as tensors containing -1s here.
- When the code enters these nested for-loops, the static method `MeanAveragePrecision.__calculate_recall_precision_scores` is called. For class indices 2 and 3, this method does not cause any change in the corresponding locations in the Precision and Recall tensors (`precision[:, :, class_idx, :, :]` and `recall[:, class_idx, :, :]`, respectively). This happens because the method returns without any change here for the given example. Hence, all entries are still -1s for these classes.
- Since all entries are -1s, this call used for calculating mean AP returns a `torch.tensor([-1.0])`.
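A minimal sketch of how the sentinel values leak into the final score (this mirrors the pycocotools-style summarization; the helper name here is hypothetical, not the actual private method in torchmetrics):

```python
import torch

def summarize(precision: torch.Tensor) -> torch.Tensor:
    # Keep only entries that were actually written during evaluation.
    valid = precision[precision > -1]
    if valid.numel() == 0:
        # No slot was ever updated, so the sentinel -1 propagates to the result.
        return torch.tensor([-1.0])
    return valid.mean()

precision = -torch.ones(4)  # initialized with the sentinel value -1
precision[0] = 1.0          # class 0 was evaluated
precision[1] = 1.0          # class 1 was evaluated
print(summarize(precision))       # tensor(1.) -- classes 2 and 3 are silently skipped
print(summarize(precision[2:3]))  # tensor([-1.]) -- per-class AP for class 2
```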
What can be done to solve this issue?
In such a case, the static method `MeanAveragePrecision.__calculate_recall_precision_scores` can be changed to modify `precision` and `recall` when there are no GT bboxes. Basically, your example represents an edge case where there cannot be any True Positives (TP); hence, Precision must be zero. Recall will be undefined in this case, since the counts of both TP and FN are zero. Note that, in your example, you had FPs for classes 2 and 3; if they were not present, then Precision would also be undefined for these classes.
Hope it helps! 😄
> What can be done to solve this issue?

@dhananjaisharma10 would you be interested in sending a draft PR with a suggested fix? :rabbit:
@Borda Sure, I can commence work on this.
@dhananjaisharma10, how are you doing? Can we help you with something? :otter:
@Borda I was a little held up in the last 2-3 weeks, though I am relatively free now. Is it okay if I provide an update by this weekend?
EDIT: I have commenced work on it.
> Is it okay if I provide an update by this weekend?

sure, whenever you have time :)
@Borda could you please check the Draft PR and guide further. Thanks.
> @Borda could you please check the Draft PR and guide further. Thanks.
could you pls open the PR to this repo? :otter: