waymo-open-dataset Understanding evaluation results for L1 difficulty

Open xiangruhuang opened this issue 2 years ago • 1 comments

I'm trying to understand how evaluation is done for L1 difficulty. Essentially, L1 difficulty selects a subset of the GT boxes. In that case, I'm not sure how precision is evaluated. Specifically, I wonder whether the predicted boxes are filtered/selected accordingly during evaluation. Intuitively, since some predicted boxes are predictions for L2-level GT boxes, does it make sense to ignore some of them when evaluating L1 difficulty metrics?

I tried searching on Google, but no one seems to have mentioned this detail. Any links/answers related to this question are appreciated. Thanks!

Apr 16 '22 15:04 xiangruhuang

Hi,

When calculating L1 metrics, we do not treat any predictions that predict L2 ground truths as false positives. Please refer to https://github.com/waymo-research/waymo-open-dataset/blob/master/waymo_open_dataset/metrics/detection_metrics.cc#L93-L95 for the implementation details.

Best, Wayne, on behalf of the Waymo Open Dataset team

Apr 28 '22 00:04 hfslyc
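
To make the answer above concrete, here is a minimal sketch of the counting rule at a single score threshold. It is not the code from `detection_metrics.cc`: the function name `l1_precision_recall` and its inputs are hypothetical, and it assumes the prediction-to-ground-truth matching (for example by IoU) has already been done. The only point it illustrates is that a prediction matched to an L2 ground truth is ignored for L1, counting as neither a true positive nor a false positive.

```python
from typing import List, Optional, Tuple


def l1_precision_recall(
    gt_levels: List[int],
    pred_matches: List[Optional[int]],
) -> Tuple[float, float]:
    """Hypothetical helper: L1 precision/recall at one score threshold.

    gt_levels[i]    -- difficulty level (1 or 2) of ground-truth box i.
    pred_matches[j] -- index of the ground-truth box matched to prediction j,
                       or None if the prediction matched nothing.
    Assumes each ground-truth box is matched by at most one prediction.
    """
    tp = 0
    fp = 0
    for match in pred_matches:
        if match is None:
            fp += 1  # unmatched prediction: a false positive
        elif gt_levels[match] == 1:
            tp += 1  # matched an L1 ground truth: a true positive
        # else: matched an L2 ground truth, so it is ignored for L1

    num_l1_gt = sum(1 for level in gt_levels if level == 1)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / num_l1_gt if num_l1_gt > 0 else 0.0
    return precision, recall


# Two L1 boxes and one L2 box; four predictions.
gt_levels = [1, 1, 2]
pred_matches = [0, 2, None, 1]  # second prediction hits the L2 box
precision, recall = l1_precision_recall(gt_levels, pred_matches)
print(precision, recall)
```

In this example the prediction matched to the L2 box is dropped from the L1 counts, so L1 precision is 2/3 rather than 2/4. The real metric sweeps over score cutoffs and performs its own matching; this sketch only shows the ignore rule.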
