OpenSfM
[Help for Project] Reprojection using point cloud and json of OpenSfM
Good afternoon, everyone
I wanted to see if anyone could help me with a project I'm doing using OpenSfM. My objective is to project from 3D to 2D using a .ply point cloud and write the matching pixels onto an image. The problem I have right now is scaling.
The image above shows 10% of the points from the 1.6-million-vertex point cloud. You can see that it has the shape of the pillar, but the scaling is all wrong. I'm using reconstruction.json from the undistorted folder and an EXIF library to extract information from the images to build the camera intrinsic matrix.
Below is a zip with the image, reconstruction.json and the .ipynb code (also the code in .txt form):
I appreciate any help. Thank you in advance and have a wonderful weekend.
Hi @JACMoutinho,
First of all, great that you even got so far :D Puh, there are a few things that could be wrong.
- Do you transform from whatever world coordinate system the PCL lives in to your image? You'll need to transform first (from world to camera: `T_cam_world`) and then project into your image with the camera matrix `K`, something like `p_img = K * T_cam_world * point_3d` (a small sketch follows below this list).
- Another thing is that we typically use normalized coordinates, so you have to convert to the actual image pixels first. There are methods in the code for that.
- Is there some scaling involved in the pipeline? Your camera parameters might be scaled.
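For reference, a minimal sketch of that chain (world point → camera frame → pixel), assuming `K` is already in pixels and that `R`, `t` define the world-to-camera transform; `project_point` is just an illustrative helper, not an OpenSfM function:

```python
import numpy as np

def project_point(K, R, t, point_world):
    """Project one 3D world point to pixel coordinates (u, v).

    K: 3x3 intrinsics in pixels; R (3x3) and t (3,) map world to camera,
    i.e. p_cam = R @ p_world + t.
    """
    p_cam = R @ point_world + t      # world -> camera frame
    p_img = K @ p_cam                # camera frame -> homogeneous pixels
    return p_img[:2] / p_img[2]      # perspective divide -> (u, v)
```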
Best, Fabian
Hello @fabianschenk
First of all, let me thank you for your quick answer and for taking the time to help me out.
- Yes, for `K` I use the metadata taken with EXIF directly from the image (specifically the focal length and the width/height):
```python
f = img.focal_length                  # focal length from EXIF (in millimeters)
C_x = int(image_src.shape[1] / 2)     # principal point at the image center
C_y = int(image_src.shape[0] / 2)
width = int(image_src.shape[1])
height = int(image_src.shape[0])
f_x = f * img.x_resolution            # focal length converted to pixels via the EXIF resolution
f_y = f * img.y_resolution
int_matrix = np.array([[f_x, 0, C_x],
                       [0, f_y, C_y],
                       [0, 0, 1]])
```
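One alternative I could also try here, in case the EXIF-derived focal length is what throws the scale off, is to build `K` from the camera entry in reconstruction.json instead; as far as I understand, OpenSfM stores the perspective "focal" normalized by the larger image dimension. A rough sketch (the helper name and the assumption of a single perspective camera are mine):

```python
import json
import numpy as np

def build_K_from_reconstruction(reconstruction_path, camera_key):
    """Build a pixel-unit intrinsic matrix from an OpenSfM reconstruction.json.

    Assumes a perspective camera whose 'focal' is normalized by
    max(width, height) and a principal point at the image center.
    """
    with open(reconstruction_path) as f:
        reconstruction = json.load(f)[0]          # first reconstruction in the file
    cam = reconstruction["cameras"][camera_key]
    width, height = cam["width"], cam["height"]
    f_pix = cam["focal"] * max(width, height)     # normalized focal -> pixels
    return np.array([[f_pix, 0.0, width / 2.0],
                     [0.0, f_pix, height / 2.0],
                     [0.0, 0.0, 1.0]])
```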
For `T_cam_world` I use the information given by the reconstruction.json file in the undistorted folder of OpenSfM to retrieve the rotation and translation of each camera pose, then cv2.Rodrigues to get a 3x3 rotation matrix from the 1x3 rotation vector.
```python
rows_shots = []
for i, img_name in enumerate(images):
    rotation = shots[img_name]['rotation']
    translation = shots[img_name]['translation']
    gps_position = shots[img_name]['gps_position']
    rows_shots.append([img_name[:-4], rotation, translation, gps_position])
img_data = pd.DataFrame(rows_shots, columns=["Image", "Rotation", "Translation", "GPS Position"])
```
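As far as I understand the OpenSfM convention, the stored rotation (axis-angle) and translation map world points into the camera frame, p_cam = R · p_world + t, so the optical center in world coordinates is -Rᵀ t. A small sketch to sanity-check one shot (the helper name is mine); if the recovered center lands roughly where that shot's gps_position says the camera was, the pose is being read with the right convention (they may not coincide exactly):

```python
import numpy as np
import cv2

def camera_center(rotation_axis_angle, translation):
    """Optical center in world coordinates for a shot stored as an
    axis-angle rotation plus translation (world-to-camera convention)."""
    R, _ = cv2.Rodrigues(np.asarray(rotation_axis_angle, dtype=float))
    t = np.asarray(translation, dtype=float).reshape(3, 1)
    return (-R.T @ t).ravel()        # o = -R^T t
```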
I then use that information to build the 4x4 extrinsic matrix and the projection matrix:
```python
# Look up the pose for the test image and convert the axis-angle
# rotation vector into a 3x3 rotation matrix
rotation_matrix_13 = img_data[img_data['Image'] == img_test]['Rotation'].to_numpy()[0]
translation_matrix = np.matrix(img_data[img_data['Image'] == img_test]['Translation'].to_numpy()[0]).T
rotation_matrix = cv2.Rodrigues(np.matrix(rotation_matrix_13))[0]

# 3x4 [R|t], extended to a 4x4 extrinsic matrix
RT_matrix = np.append(rotation_matrix, translation_matrix, 1)
RT_matrix4x4 = np.vstack((RT_matrix, [0, 0, 0, 1]))

# 3x4 projection matrix P = K [R|t]
ones = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]])
P = int_matrix.dot(ones).dot(RT_matrix4x4)

# Project the homogeneous 3D points and apply the perspective divide
coords2D = P.dot(test_coord.T)
coords2D = np.array(coords2D.T)
u = coords2D[0] / coords2D[2]
v = coords2D[1] / coords2D[2]
```
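On top of that, a small filter I could add before drawing: keep only points that project in front of the camera and inside the image bounds. A vectorized sketch (the names are mine), assuming `P` is the 3x4 matrix built above and `points_h` is an N x 4 array of homogeneous world points:

```python
import numpy as np

def project_and_filter(P, points_h, width, height):
    """Project homogeneous Nx4 world points with a 3x4 matrix P, keeping
    only points with positive depth that land inside the image."""
    proj = np.asarray(P @ points_h.T)        # 3 x N homogeneous pixels
    z = proj[2]
    u = proj[0] / z
    v = proj[1] / z
    keep = (z > 0) & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return u[keep], v[keep]
```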
For the 3D coordinates I use the merged.ply generated by OpenSfM:
```python
PLY_file = "merged.ply"
vertices = []
with open(PLY_file, 'r') as f:
    # Parse the ASCII PLY header to find the vertex count
    line = f.readline().strip()
    num_vertices = None
    while line != 'end_header':
        elements = line.split()
        if elements[0] == 'element' and elements[1] == 'vertex':
            num_vertices = int(elements[2])
        line = f.readline().strip()
    # Read the vertex positions
    for _ in range(num_vertices):
        line = f.readline().strip()
        vertex = list(map(float, line.split()))
        vertices.append(vertex)
```
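If the hand-rolled parser ever becomes a bottleneck, or the PLY turns out to be binary, an alternative is to read it with the `plyfile` package (assuming it is installed); a minimal sketch:

```python
import numpy as np
from plyfile import PlyData

def load_ply_vertices(path):
    """Read vertex positions from a PLY file (ASCII or binary)."""
    ply = PlyData.read(path)
    v = ply["vertex"]
    return np.column_stack([v["x"], v["y"], v["z"]])
```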
- In terms of normalization, I use the raw values from the .ply, which I assume are in the "world" coordinate system the point cloud is built in (a better way to put it: the coordinates follow the coordinate system defined by OpenSfM).
My process to convert back to 2D, taking into consideration the code I presented in point 1, is P = K[R|t]:

```python
P = int_matrix.dot(ones).dot(RT_matrix4x4)   # P = K [R|t]
coords2D = P.dot(test_coord.T)
coords2D = np.array(coords2D.T)
u = coords2D[0] / coords2D[2]
v = coords2D[1] / coords2D[2]
```
- In terms of the camera matrix, I use the focal length in millimeters and use the resolution to convert that value to pixels. Same for the center C_x and C_y:
```python
f = img.focal_length
C_x = int(image_src.shape[1] / 2)
C_y = int(image_src.shape[0] / 2)
width = int(image_src.shape[1])
height = int(image_src.shape[0])
f_x = f * img.x_resolution
f_y = f * img.y_resolution
int_matrix = np.array([[f_x, 0, C_x],
                       [0, f_y, C_y],
                       [0, 0, 1]])
```
That's the only scaling I'm using. For the camera matrix, I'm only using the metadata from the image file.
I'm sorry if I'm all over the place with this explanation and wasn't able to answer your questions as you expected; I'm fairly new to this subject.
EDIT:
For consideration: could it be that the translation and rotation from the reconstruction.json file use a different reference frame than the point cloud? If so, is there a default scaling value in the pipeline that I could use to "renormalize"/"denormalize" the 2D or 3D coordinates?
Hi @JACMoutinho ,
The code you shared seems correct, even though I didn't check all the details. Can you try to read the 3D points from the reconstruction.json instead of the .ply file?
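A minimal sketch of that, assuming the usual layout of reconstruction.json (a top-level list of reconstructions, each holding a "points" map whose entries carry a "coordinates" triple); these points live in the same coordinate frame as the shot poses:

```python
import json
import numpy as np

def load_points_from_reconstruction(path):
    """Read the triangulated 3D points from an OpenSfM reconstruction.json."""
    with open(path) as f:
        reconstruction = json.load(f)[0]     # first reconstruction in the file
    return np.array([p["coordinates"]
                     for p in reconstruction["points"].values()])
```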
Maybe you can also look through the code for examples of how to reproject. We're not actively working on OpenSfM anymore and the last time I used it was around 1.5-2 years ago, so I don't remember all the details anymore.
Good luck, Fabian