PlaneSweepPose

Performance in a new scene

Open cv1995 opened this issue 3 years ago • 7 comments

In a new scene (no GT data, only camera parameters), is it possible to get accurate 3D poses by directly using the pretrained model you provided?

cv1995 avatar Jun 29 '21 07:06 cv1995

Hi,

The pretrained model should work on new scenes as long as the range of depth is set correctly to cover the scene of interest.
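As a minimal sketch of how such a range could be estimated (the config names `POSE_MIN_DEPTH`/`POSE_MAX_DEPTH` come up later in this thread; the world-to-camera convention `R`, `t` and the helper name are assumptions for illustration), one can project a few points bounding the scene into each camera frame and pad the resulting depths:

```python
import numpy as np

def estimate_depth_range(world_points, R, t, margin=500.0):
    """Rough per-camera depth range (mm) covering a set of world-space points.

    R: 3x3 world-to-camera rotation, t: (3,) translation (assumed convention).
    margin pads the range so it safely covers the scene.
    """
    cam_points = (R @ world_points.T + t.reshape(3, 1)).T  # world -> camera frame
    depths = cam_points[:, 2]                              # z is depth
    return max(0.0, depths.min() - margin), depths.max() + margin

# Hypothetical corners roughly bounding the capture area (mm)
corners = np.array([[0.0, 0, 0], [5000, 0, 0], [0, 4000, 0], [5000, 4000, 2000]])
R = np.eye(3)
t = np.array([0.0, 0.0, 1000.0])
print(estimate_depth_range(corners, R, t))  # use this to set POSE_MIN_DEPTH / POSE_MAX_DEPTH
```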

jiahaoLjh avatar Jul 02 '21 06:07 jiahaoLjh

Thank you for your reply. How can I get an accurate depth range in the new scene? I used synthesized 3D poses to calculate an approximate depth range from our camera parameters and modified POSE_MIN_DEPTH and POSE_MAX_DEPTH (0.0-7000.0), but when running inference with the Shelf pretrained model, the following erroneous results appear (the 2D poses are the result of reprojection). What could be the reason?

[image: visualization of the erroneous 3D pose predictions]

cv1995 avatar Jul 02 '21 08:07 cv1995

You don't have to get the exact depth range. An approximate range that roughly covers the whole scene will work.

From your visualization, it seems the depth estimation is not correct. Can you visualize the following outputs and check what depth distribution the network produces?

  • pose_score_volume
  • pose_depth_volume
  • joint_score_volume
  • joint_depth_volume
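As a hedged sketch of how such a check might look (the actual shapes of these volumes depend on the repo; here a depth volume is assumed to be `(num_persons, num_depth_levels)` with each row a score distribution over discrete depth levels), one can summarize where each distribution peaks:

```python
import numpy as np

def summarize_depth_volume(volume, depth_min, depth_max):
    """Report where each person's depth distribution concentrates.

    volume: (num_persons, num_depth_levels) scores over discrete depth
    levels (an assumed layout, for illustration only).
    """
    num_persons, num_levels = volume.shape
    depths = np.linspace(depth_min, depth_max, num_levels)
    peak = depths[volume.argmax(axis=1)]                              # mode per person
    expected = (volume / volume.sum(axis=1, keepdims=True)) @ depths  # soft-argmax
    return peak, expected

# Synthetic check: two persons with Gaussian peaks at levels 20 and 45 of 64
levels = np.arange(64)
vol = np.exp(-0.5 * ((levels[None, :] - np.array([[20.0], [45.0]])) / 3.0) ** 2)
peak, expected = summarize_depth_volume(vol, 0.0, 7000.0)
print(peak)  # per-person peak depths in mm
```

If the peaks fall outside the configured [POSE_MIN_DEPTH, POSE_MAX_DEPTH] interval, or the distributions are flat, the range likely does not cover the scene.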

jiahaoLjh avatar Jul 04 '21 07:07 jiahaoLjh

Can you explain in detail how to determine the depth range based on these four outputs? I find that errors occur whenever the depth range set during inference is inconsistent with the one used in training. Can I add you on WeChat for further communication?

cv1995 avatar Jul 05 '21 09:07 cv1995

These data are the aggregated scores (pose_score_volume and joint_score_volume) and the depth volumes output by the neural network, which indicate the likelihood of the depth being at each discrete depth level (pose_depth_volume and joint_depth_volume). You can visualize them to check (1) whether the aggregated scores show reasonable cross-view correspondences and (2) whether the network produces high responses at the correct depth level.
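Check (2) can be sketched as follows (a minimal illustration, not the repo's API; the volume shape, helper name, and tolerance are assumptions) by comparing the network's strongest depth response against a depth you trust, e.g. one measured from the scene:

```python
import numpy as np

def check_depth_response(depth_volume, depth_min, depth_max, reference_depth, tol=300.0):
    """Does the network respond most strongly near a trusted reference depth?

    depth_volume: (num_depth_levels,) response for one person/joint
    reference_depth: a depth (mm) known from scene geometry.
    """
    depths = np.linspace(depth_min, depth_max, depth_volume.shape[0])
    predicted = depths[depth_volume.argmax()]
    return abs(predicted - reference_depth) <= tol, predicted

# Synthetic volume peaking near 3000 mm over a 0-7000 mm range
depths = np.linspace(0.0, 7000.0, 64)
vol = np.exp(-0.5 * ((depths - 3000.0) / 150.0) ** 2)
ok, pred = check_depth_response(vol, 0.0, 7000.0, reference_depth=3000.0)
print(ok, pred)
```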

You can directly email me at [email protected] if you would like to have further discussion.

jiahaoLjh avatar Jul 17 '21 11:07 jiahaoLjh

Hi @cv1995, did you manage to get better results for the 3D predictions? I am also seeing the same errors when using the provided pretrained model on a multi-view scene of my own. The first thing I did was modify the depth range according to my scene, but the predictions remain poor regardless of whether I use a custom range or the same one used to train the model.

Thank you

AntonioEscamilla avatar Dec 23 '21 01:12 AntonioEscamilla

Hi Antonio, I want to get the best 3D pose prediction. Looking into the code in shelf.py, there seem to be many 3D pose candidates, and they are compared against the GT to pick the best one. How can I get the best candidate when I don't have GT?
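For illustration, one common alternative when GT is unavailable (a hedged sketch, not the repo's API; the candidate/score names and shapes are hypothetical) is to rank candidates by the network's own confidence instead of by GT distance:

```python
import numpy as np

def pick_best_candidate(candidates, scores):
    """Select one 3D pose by confidence instead of GT matching.

    candidates: (num_candidates, num_joints, 3) hypothesized 3D poses
    scores:     (num_candidates,) per-candidate confidence (assumed to be
                derivable from the network's score volumes).
    """
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])

cands = np.zeros((3, 14, 3))
cands[1] += 1.0  # mark candidate 1 so the selection is visible
pose, score = pick_best_candidate(cands, np.array([0.2, 0.9, 0.4]))
print(score)  # 0.9
```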

stevechaw avatar Apr 26 '23 03:04 stevechaw