PoseCNN

Projection Scaling issue

Open TotoLulu94 opened this issue 6 years ago • 6 comments

Hi @yuxng ,

I tried to test PoseCNN on my own images and got strange results: figure_2

The poses seem to be well estimated, but there appears to be a scaling issue. If I understood your paper correctly, this probably comes from the bounding boxes (shown with the labels) being mispredicted, but I don't really get why. To record my images, I also used an Asus Xtion Pro Live RGB-D camera, with the following intrinsic matrix: "[[528, 0, 320], [0, 528, 240], [0, 0, 1]]". I then tried to test it with another intrinsic matrix (as if I were working with a 1280x960 image): figure_1

The projections look better but the estimated translation vector looks off (just by considering the bleach and mustard). The intrinsic matrix used for this is : " [[978.9, 0, 320], [0, 978.9, 240], [0, 0, 1]] "

Do you have any idea what could cause this? Is my first intrinsic matrix right and I am just missing a scaling factor? Or is my second matrix right and the estimate of the bleach pose is just bad?

Also, these estimations have been made without ICP refinement.

TotoLulu94 • Aug 14 '18 17:08
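To make the scaling question above concrete, here is a minimal sketch of pinhole projection with the two intrinsic matrices quoted in the post. It is not taken from the PoseCNN code; the helper names `project` and `pixel_width` and the 10 cm test object are assumptions for illustration. It only shows that, for a fixed pose, the projected size of an object grows in proportion to the focal length in the matrix used for drawing.

```python
import numpy as np

def project(points_cam, K):
    """Project Nx3 camera-frame points (metres) to Nx2 pixel coordinates."""
    uvw = points_cam @ K.T            # rows are (f*x + cx*z, f*y + cy*z, z)
    return uvw[:, :2] / uvw[:, 2:3]   # divide by depth

# The two intrinsic matrices quoted in the post.
K_xtion = np.array([[528.0, 0.0, 320.0],
                    [0.0, 528.0, 240.0],
                    [0.0, 0.0, 1.0]])
K_alt = np.array([[978.9, 0.0, 320.0],
                  [0.0, 978.9, 240.0],
                  [0.0, 0.0, 1.0]])

# Hypothetical object edge: two points 10 cm apart, 1 m in front of the camera.
edge = np.array([[-0.05, 0.0, 1.0],
                 [0.05, 0.0, 1.0]])

def pixel_width(K):
    """Horizontal extent of the edge in pixels when drawn with K."""
    uv = project(edge, K)
    return float(uv[1, 0] - uv[0, 0])

width_xtion = pixel_width(K_xtion)
width_alt = pixel_width(K_alt)
print(width_xtion, width_alt, width_alt / width_xtion)
```

So if a pose was estimated under one focal length and is drawn with another, the overlay will look too small or too large by roughly the ratio of the two focal lengths, even when the rotation is fine.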

The network is trained with the specific intrinsic matrix in the YCB_Video dataset. So the distance of the object that the network predicts is related to that intrinsic matrix. You can try to visualize the results using that intrinsic matrix.

yuxng • Aug 14 '18 18:08
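One possible reading of the reply above, sketched in code: if the network learned depth from apparent object size under the training focal length, then for a camera with a different focal length the predicted depth would be off by roughly the ratio of the two focal lengths. The function `rescale_translation` is hypothetical and not part of PoseCNN. The `K_TRAIN` values are what I believe the YCB-Video intrinsics to be, so please check them against your copy of demo.py. This is a first-order approximation that ignores lens distortion, the small fx/fy difference, and any effect of image resolution on the network.

```python
import numpy as np

# Assumed training intrinsics (believed to match YCB-Video; verify in demo.py).
K_TRAIN = np.array([[1066.778, 0.0, 312.9869],
                    [0.0, 1067.487, 241.3109],
                    [0.0, 0.0, 1.0]])
# Intrinsics of the camera that actually took the image (from the first post).
K_CAM = np.array([[528.0, 0.0, 320.0],
                  [0.0, 528.0, 240.0],
                  [0.0, 0.0, 1.0]])

def rescale_translation(t_pred, K_train, K_cam):
    """Approximate a translation for K_cam from one predicted under K_train."""
    x, y, z = t_pred
    # Pixel where the network placed the object centre (independent of K).
    u = K_train[0, 0] * x / z + K_train[0, 2]
    v = K_train[1, 1] * y / z + K_train[1, 2]
    # Apparent size goes as f / z, so the same size under K_cam means
    # a depth scaled by f_cam / f_train (using fx for both).
    z_cam = z * K_cam[0, 0] / K_train[0, 0]
    # Back-project the same pixel with the real camera intrinsics.
    x_cam = (u - K_cam[0, 2]) * z_cam / K_cam[0, 0]
    y_cam = (v - K_cam[1, 2]) * z_cam / K_cam[1, 1]
    return np.array([x_cam, y_cam, z_cam])

t_pred = np.array([0.10, -0.05, 1.00])  # hypothetical network output, metres
t_cam = rescale_translation(t_pred, K_TRAIN, K_CAM)
print(t_cam)
```

Whether this correction is good enough in practice, or whether retraining with your own intrinsics is needed, is not something this sketch can answer.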

Using the same intrinsic matrix as in demo.py, I get a result comparable to my second case. I guess that makes sense, since the intrinsic matrix of my camera is different. Thanks for your reply!

TotoLulu94 • Aug 14 '18 18:08

I am going to retrain the network using my own intrinsic matrix, but I have another question regarding that: Since the intrinsic matrix of my camera and the intrinsic matrix of the camera used to capture the dataset, how will that affect the results?

TotoLulu94 • Aug 15 '18 13:08

You might have missed a verb or there is a typo, but your question doesn't make sense to me, @TotoLulu94. I guess you were asking whether it is possible to retrain with one's own intrinsic matrix when the training dataset was recorded with a different camera? Is that your question? If yes, I would also be interested in the answer.

Kaju-Bubanja • Aug 15 '18 16:08

Yep that's right :

  • Since the intrinsic matrix of my camera and the intrinsic matrix of the camera used to capture the dataset are different, how will that affect the results?

TotoLulu94 • Aug 15 '18 17:08

@TotoLulu94 I am seeing a scaling issue similar to the one you mentioned when using a Kinect with a different intrinsic matrix. The predicted translation vector seems to be off by 10-20 cm. Did retraining on the YCB video dataset with your own intrinsic matrix solve your problem?

aditya2592 • Nov 15 '18 22:11