How does the evaluation work?

Open ch3cook-fdu opened this issue 2 years ago • 0 comments

I notice that the definition of ref_acc (line 89, lib/eval_helper.py) checks whether the selected bounding box is the predicted box that has the maximum IoU with the target box.

However, in my understanding, the expected output of 3D visual grounding is a single bounding box for the given input scene and language query. So is this metric only an intermediate evaluation rather than the final one?
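
To make the distinction concrete, here is a minimal sketch of the two quantities as I understand them. It is not the code from lib/eval_helper.py: the function names, the axis-aligned box layout (xmin, ymin, zmin, xmax, ymax, zmax), and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical box layout: (xmin, ymin, zmin, xmax, ymax, zmax), axis-aligned.
Box = Sequence[float]


def box_iou_3d(a: Box, b: Box) -> float:
    """IoU of two axis-aligned 3D boxes."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def matching_ref_acc(pred_boxes: Sequence[Box], selected_idx: int, target_box: Box) -> bool:
    """Matching-style check (my reading of ref_acc): is the selected
    proposal the one with the highest IoU against the target box?"""
    ious = [box_iou_3d(p, target_box) for p in pred_boxes]
    best_idx = max(range(len(ious)), key=ious.__getitem__)
    return selected_idx == best_idx


def grounding_hit(pred_boxes: Sequence[Box], selected_idx: int, target_box: Box,
                  iou_threshold: float = 0.5) -> bool:
    """Single-box check: does the one selected box overlap the target
    box with IoU at or above the (assumed) threshold?"""
    return box_iou_3d(pred_boxes[selected_idx], target_box) >= iou_threshold
```

With two proposals (0, 0, 0, 1, 1, 1) and (5, 5, 5, 6, 6, 6), a target of (0.5, 0, 0, 1.5, 1, 1), and the first proposal selected, the matching check passes because that proposal is the best available one, yet its IoU with the target is only 1/3, so the single-box check at 0.5 fails. This is the gap I am asking about.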

Mar 07 '23 07:03 ch3cook-fdu