Object detection: precision vs. AP
Search before asking
- [X] I have searched the Supervision issues and found no similar feature requests.
Question
Suppose we have a single-class object detection problem. Shouldn't the average precision and the precision metrics across the different IoU thresholds be the same? This does not seem to be the case here.
Additional
No response
I'll have a look. Thank you for the report, @GiannisApost
Great question, @GiannisApost! This touches on a fundamental concept in object detection evaluation metrics. Let me clarify the relationship between Average Precision (AP) and Precision across different IoU thresholds.
Key Differences:
Average Precision (AP) and Precision are fundamentally different metrics:
1. Precision at Single IoU Threshold
- Definition: TP / (TP + FP) at a specific IoU threshold (e.g., 0.5)
- Single Value: Gives one precision score for that threshold
- Threshold Dependent: Changes significantly with IoU threshold
2. Average Precision (AP)
- Definition: Area under the Precision-Recall curve
- Multiple Evaluations: Computed by varying confidence thresholds (not IoU)
- Fixed IoU: Calculated at a single, fixed IoU threshold
- Comprehensive: Captures performance across all confidence levels
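To make the distinction concrete, here is a minimal sketch (plain NumPy, not the supervision API; the detection list and ground-truth count are made-up toy values) that computes both metrics from the same matched detections:

```python
import numpy as np

# Toy single-class detections already matched against ground truth at
# IoU = 0.5: each entry is (confidence, is_true_positive). Hypothetical values.
detections = [
    (0.95, True), (0.90, True), (0.80, False), (0.75, True),
    (0.60, True), (0.50, False), (0.40, True), (0.30, False),
]
num_gt = 6  # total ground-truth boxes in the toy dataset

confs = np.array([c for c, _ in detections])
tps = np.array([tp for _, tp in detections], dtype=float)

# Precision@0.5 IoU at one confidence threshold: a single number.
keep = confs >= 0.5
precision_at_conf = tps[keep].sum() / keep.sum()

# AP@0.5 IoU: rank by confidence, build the precision-recall curve,
# then integrate the area under it (rectangle rule; COCO interpolates).
tp_cum = np.cumsum(tps)            # detections are already sorted by confidence
fp_cum = np.cumsum(1.0 - tps)
recall = tp_cum / num_gt
precision = tp_cum / (tp_cum + fp_cum)
r = np.concatenate(([0.0], recall))
ap = float(np.sum((r[1:] - r[:-1]) * precision))

print(f"Precision (IoU=0.5, conf>=0.5): {precision_at_conf:.3f}")  # 0.667
print(f"AP (IoU=0.5, all confidences):  {ap:.3f}")                 # ~0.711
```

The two numbers differ because precision reads off one operating point, while AP integrates over every operating point of the same detector.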
Why They're Different:
1. Different Dimensions:
   - Precision: varies with the IoU threshold (stricter IoU → lower precision)
   - AP: varies with the confidence threshold at a fixed IoU
2. Mathematical Relationship (both at IoU = 0.5):
   - AP@0.5 = ∫ Precision(Recall) dRecall, sweeping all confidence thresholds
   - Precision@0.5 = TP / (TP + FP), at a single confidence threshold
3. Practical Example:
   - At IoU = 0.5: AP might be 0.85 (good across all confidence levels)
   - At IoU = 0.7: AP might be 0.65 (stricter matching)
   - But Precision@0.5 at conf = 0.5 might be 0.92 (a single point on the P-R curve)
Expected Behavior:
AP should decrease as IoU threshold increases because:
- Stricter IoU requirements reduce TP count
- More detections become FPs
- This is the standard COCO evaluation pattern: AP@0.5 > AP@0.75
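To see that monotone drop numerically, here is a hedged sketch (pure NumPy, hypothetical boxes, simplified greedy matching in the spirit of COCO-style evaluators rather than supervision's own implementation) that recomputes AP at increasingly strict IoU thresholds:

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def average_precision(preds, gts, iou_thr):
    """AP for one class: preds are (box, confidence), gts are boxes."""
    preds = sorted(preds, key=lambda p: -p[1])  # highest confidence first
    matched = [False] * len(gts)
    tps = np.zeros(len(preds))
    for i, (box, _) in enumerate(preds):
        # Greedily match to the best still-unmatched ground-truth box.
        best, best_iou = -1, iou_thr
        for j, g in enumerate(gts):
            if matched[j]:
                continue
            v = iou(box, g)
            if v >= best_iou:
                best, best_iou = j, v
        if best >= 0:
            matched[best] = True
            tps[i] = 1.0
    tp_cum, fp_cum = np.cumsum(tps), np.cumsum(1.0 - tps)
    recall = tp_cum / max(len(gts), 1)
    precision = tp_cum / (tp_cum + fp_cum)
    r = np.concatenate(([0.0], recall))
    return float(np.sum((r[1:] - r[:-1]) * precision))

# Hypothetical single-class scene: 3 ground-truth boxes, 4 predictions.
gts = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
preds = [
    ((1, 1, 10, 10), 0.9),    # IoU ≈ 0.81 with gts[0]
    ((22, 22, 31, 31), 0.8),  # IoU ≈ 0.55 with gts[1]
    ((45, 45, 55, 55), 0.7),  # IoU ≈ 0.14 with gts[2]: always a FP
    ((60, 60, 70, 70), 0.6),  # overlaps nothing: always a FP
]

for thr in (0.5, 0.7, 0.9):
    print(f"AP@{thr}: {average_precision(preds, gts, thr):.3f}")
# AP@0.5: 0.667, AP@0.7: 0.333, AP@0.9: 0.000
```

The drop from 0.667 to 0.333 to 0.0 is exactly the AP@0.5 > AP@0.75 pattern above: each stricter threshold turns former TPs into FPs.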
If you're seeing unexpected behavior, I'd be happy to help debug the specific metrics computation!
Best regards, Gabriel