What is the most time-consuming step in predicting the pose of an object in an image?

Open PanJiangSCU opened this issue 3 years ago • 2 comments

Dear Liu Yuan:

I would like to know which step is the most time-consuming when predicting the pose of an object in an image, so that I can build some applications on top of it. Thank you!

Nov 16 '22 01:11 PanJiangSCU

I think the paper describes the pose refinement as the slowest step:

Running time. To process an image of size 540×960, Gen6D estimator costs ∼0.64 second in total on a 2080Ti GPU, in which the object detector costs ∼0.1 second, the viewpoint selector costs ∼0.04 second and the refiner with 3 times refinement costs ∼0.5 second.
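To make the breakdown concrete, here is a small sketch that turns the quoted numbers into per-stage shares. The timings are the ones from the paper quote above; the `estimated_total` helper is hypothetical and assumes the refiner cost scales linearly with the number of refinement iterations, which the paper quote does not state.

```python
# Stage timings quoted from the paper (540x960 image, 2080Ti GPU), in seconds.
stage_seconds = {
    "detector": 0.10,
    "selector": 0.04,
    "refiner (3 iterations)": 0.50,
}

total = sum(stage_seconds.values())
shares = {name: t / total for name, t in stage_seconds.items()}

# Print each stage with its absolute time and its share of the total.
for name, t in stage_seconds.items():
    print(f"{name:<24} {t:.2f} s  {shares[name]:6.1%}")
print(f"{'total':<24} {total:.2f} s")


def estimated_total(refine_iters: int) -> float:
    """Rough estimate only. ASSUMPTION: refiner cost is linear in iterations."""
    per_iter = stage_seconds["refiner (3 iterations)"] / 3
    return stage_seconds["detector"] + stage_seconds["selector"] + refine_iters * per_iter
```

With these numbers the refiner accounts for roughly 78% of the total (0.5 / 0.64), so under the linearity assumption, reducing the number of refinement iterations is the most direct way to trade accuracy for speed.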

Nov 16 '22 04:11 EternalGoldenBraid

Yes, the most time-consuming part is the refiner, because it involves a 3D CNN, which is relatively slow. Moreover, I only use the detector and the selector for initialization, so they only need to run once. Afterward, only the refiner is applied iteratively. To improve efficiency, the key is to speed up the refiner.
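To check this on your own hardware, a minimal timing sketch along these lines may help. The three stages here are hypothetical stand-ins (plain `time.sleep` calls), not Gen6D functions; you would replace them with closures that call the real detector, selector, and refiner.

```python
import time
from typing import Callable, Dict


def time_stages(stages: Dict[str, Callable[[], object]], repeats: int = 5) -> Dict[str, float]:
    """Return the mean wall-clock seconds per call for each stage."""
    results = {}
    for name, fn in stages.items():
        fn()  # warm-up call, excluded from the measurement
        start = time.perf_counter()
        for _ in range(repeats):
            fn()
        results[name] = (time.perf_counter() - start) / repeats
    return results


# Hypothetical stand-ins: replace with closures that call your real stages.
# For GPU code, synchronize (e.g. torch.cuda.synchronize()) inside each closure
# before it returns; otherwise asynchronous kernels make the timings misleading.
stages = {
    "detector": lambda: time.sleep(0.010),
    "selector": lambda: time.sleep(0.004),
    "refiner": lambda: time.sleep(0.050),
}

timings = time_stages(stages)
slowest = max(timings, key=timings.get)
print(slowest, timings)
```

Since the detector and the selector run only once for initialization, it is the per-call refiner time that matters most when pose estimation runs repeatedly.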

Nov 16 '22 10:11 liuyuan-pal