
Camera pose vs Camera Matrix

Open medakk opened this issue 6 years ago • 1 comments

I am trying to perform bundle adjustment in a problem where I know the location and rotation of the camera in world coordinate space (which I'll refer to as the pose of the camera). I use the inverse of this pose (the camera matrix) to triangulate feature points between frames.

Now, I'm a bit confused about how to feed this into g2o. SE3Quat seems to represent the camera pose, but when it maps a point p it computes R*p + t. My question is: what should I use as input to g2o, given that the camera matrix applies the translation first and then the rotation?

I am currently using VertexSE3Expmap to represent the camera, VertexSBAPointXYZ for the triangulated points, and EdgeSE3ProjectXYZ for the edge between them.
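For what it's worth, the R*p + t behaviour described above is consistent with the vertex estimate being the world-to-camera transform rather than the pose, in which case you would feed g2o the inverse of your camera pose. A minimal numpy sketch of that relationship (synthetic values, not g2o API calls):

```python
import numpy as np

# Hypothetical camera pose in world coordinates (assumed values):
# rotation R_wc and translation t_wc of the camera in the world frame.
theta = 0.3
R_wc = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                 [np.sin(theta),  np.cos(theta), 0.0],
                 [0.0,            0.0,           1.0]])
t_wc = np.array([1.0, 2.0, 3.0])

# The world-to-camera transform (the form that maps a point as R*p + t)
# is the inverse of the pose:
R_cw = R_wc.T               # for a rotation, the inverse is the transpose
t_cw = -R_wc.T @ t_wc

# A world point to map into the camera frame:
X = np.array([4.0, 5.0, 6.0])

# "Translate first, then rotate" with the pose...
x_a = R_wc.T @ (X - t_wc)
# ...equals "rotate, then translate" with the inverse transform:
x_b = R_cw @ X + t_cw
assert np.allclose(x_a, x_b)
```

So the two conventions describe the same mapping; the question is only which of the two (R, t) pairs a given vertex type expects.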

medakk avatar Jun 10 '19 12:06 medakk

@medakk I have the same concern. If the camera pose is (R, t), the correct way to compute the projection is x = R^-1.dot(X - t); x = R.dot(X) + t (see EdgeProjectionXYZ::computeError, SE3Quat::map) does not make sense to me. But I have to say that with the (R, t) ground truth provided by the TUM dataset, cam_pt3 = R.dot(X) + t does give a point in camera space. I cannot figure out the cause, but the result seems to be correct. @RainerKuemmerle could you help me figure out what's wrong with my original idea?
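One way to untangle this: the two formulas coincide exactly when (R, t) already denotes the world-to-camera transform instead of the pose, so a quick diagnostic is to map a candidate camera center through both formulas and see which one yields the origin. A numpy sketch with synthetic (assumed) values:

```python
import numpy as np

# A synthetic (R, t) of unknown convention:
theta = 0.5
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0,           1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, -1.0, 2.0])

# If (R, t) is the camera *pose* (camera-to-world), the camera center is t.
center_if_pose = t
# If (R, t) is the *world-to-camera* transform, the center is -R^T t.
center_if_w2c = -R.T @ t

# The pose convention maps its center to the origin via R^-1 (X - t):
assert np.allclose(R.T @ (center_if_pose - t), np.zeros(3))
# The world-to-camera convention maps its center to the origin via R X + t:
assert np.allclose(R @ center_if_w2c + t, np.zeros(3))
```

Running this check against the actual (R, t) loaded from the dataset would reveal which convention the loading code ends up with, which may explain why R.dot(X) + t "unexpectedly" works.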

Here are the steps to reproduce:

...
# tracker._match
# internally uses cv2.estimatePose and returns the result
R, t = resolvePose(...)

# note: R, t from OpenCV satisfy X2 = R.dot(X1) + t; they are not the
# camera rotation and translation.

# current frame
#
# the camera movement is the inverse of [[R, t], [0, 1]], which is
# [[R.inv(), -R.inv().dot(t)], [0, 1]]
# (see coordinate transformations in rigid-body geometry)
self.cur.cam.Rwc = R.inv()
self.cur.cam.twc = -R.inv().dot(t)

# tracker._ColdStartTrack
# initialization method

# internally calls OpenCV
points, cur_cam_pts, ref_cam_pts = self.triangulate(...)

# checking error
for p in points:
    ...

if USE_GROUND_TRUTH:
    # now we have the right scale and depth

    # internally I wanted to use cam_pt = Rwc.inv().dot(p - t), which failed
    # unexpectedly; however, cam_pt = Rwc.dot(p) + t seems to give a good
    # projection
    cam_pt = self.cur.cam.view(p)

    px1 = self.cur.cam.proj(cam_pt)

    # using the R, t computed in the previous step
    cam_pt_ref = R.dot(cam_pt) + t

    ...
...
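As a sanity check on the inversion used for Rwc/twc above, the homogeneous-matrix identity in the comment can be verified numerically. A small numpy sketch with a synthetic rotation and translation (not values from the tracker):

```python
import numpy as np

# Claim to verify: the inverse of [[R, t], [0, 1]] is
# [[R^T, -R^T t], [0, 1]]  (for a rotation matrix, R.inv() == R.T).
rng = np.random.default_rng(0)

# Build a random proper rotation via QR decomposition (assumed helper,
# not part of the original snippet):
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R = Q * np.sign(np.linalg.det(Q))   # flip sign if needed so det(R) = +1
t = rng.normal(size=3)

# Assemble the 4x4 homogeneous transform and its claimed inverse:
T = np.eye(4)
T[:3, :3], T[:3, 3] = R, t

T_inv = np.eye(4)
T_inv[:3, :3] = R.T
T_inv[:3, 3] = -R.T @ t

# Their product should be the identity:
assert np.allclose(T @ T_inv, np.eye(4))
```

So the Rwc/twc assignment itself is algebraically sound; the remaining question is purely which convention (R, t) and the g2o vertex estimate are each expressed in.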


What's wrong with the code converting world coordinates to camera space?

yiakwy avatar May 14 '20 03:05 yiakwy