RGBD-Integration-2020

Incorrect reconstruction result

Open jly0810 opened this issue 1 year ago • 7 comments

Excuse me, the following is the reconstruction result from my dataset. The dataset contains three groups of RGBD images, but the reconstructed parts do not align well with each other. What could be the reason? In your sample dataset, is the camera fixed while only the person rotates? Is that necessary?

[three screenshots of the reconstruction result]

The following are the original RGB pictures: A_hat1_color_frame0, A_hat1_color_frame1, A_hat1_color_frame2

I hope you can solve my problem. It has troubled me for a long time. Thank you!!

jly0810 avatar Oct 24 '22 10:10 jly0810

Hi @jly0810! In my experiment the camera was fixed and only the person was rotating. However, you can do the opposite: the person can sit still while the camera is moved around them.

  1. The main requirement is that the change between adjacent frames should be small. Do you skip frames with the parameter https://github.com/Ritchizh/RGBD-Integration-2020/blob/42cb66827580d2376ffcee5b60cd765e10ef2b5b/main__TSDF_Integrate__color_depth.py#L27 ? If yes, try decreasing it. In the frames above the change seems too large.
  2. The main issue with your frames, as far as I can see, is that you skipped the subject-segmentation step: you should delete the background behind your subject. I haven't published the segmentation code. For every depth frame you should remove all pixels whose distance value is larger than a threshold; alternatively, for each point cloud you can delete points whose z coordinate is larger than a threshold (a minimal sketch of both options is given below).
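
To illustrate the idea (this is not the unpublished segmentation script), here is a minimal sketch of both options using NumPy and Open3D; the file names and the 1.2 m threshold are placeholders, and `select_by_index` is the method name in recent Open3D versions:

```python
import numpy as np
import open3d as o3d

DEPTH_THRESHOLD_M = 1.2   # keep only geometry closer than ~1.2 m (placeholder value)
DEPTH_SCALE = 1000.0      # RealSense depth images typically store millimeters

# Option 1: zero out far pixels directly in the depth frame (0 = no measurement).
depth = np.asarray(o3d.io.read_image("depth_frame0.png")).copy()   # hypothetical file name
depth[depth > DEPTH_THRESHOLD_M * DEPTH_SCALE] = 0
o3d.io.write_image("depth_frame0_seg.png", o3d.geometry.Image(depth))

# Option 2: drop far points from an already-created point cloud.
pcd = o3d.io.read_point_cloud("cloud0.ply")                         # hypothetical file name
z = np.asarray(pcd.points)[:, 2]
pcd_seg = pcd.select_by_index(np.where(z <= DEPTH_THRESHOLD_M)[0].tolist())
o3d.io.write_point_cloud("cloud0_seg.ply", pcd_seg)
```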

Ritchizh avatar Oct 24 '22 11:10 Ritchizh

Have you tried anything yet?

  1. You can try varying the distance-truncation parameter here: https://github.com/Ritchizh/RGBD-Integration-2020/blob/42cb66827580d2376ffcee5b60cd765e10ef2b5b/main__TSDF_Integrate__color_depth.py#L96 (a rough sketch of how this parameter enters the TSDF volume is given after this list).
  2. I looked into the project; two years have passed and I can't remember which of the segmentation scripts I used 😅 So I'll upload the one that seems right. It relies on the fact that, before capturing the subject, a background .bag is recorded.
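
For context, a rough sketch of how such a truncation distance is typically passed to Open3D's ScalableTSDFVolume; the values here are placeholders, not the repo's defaults, and older Open3D releases expose the class as o3d.integration rather than o3d.pipelines.integration:

```python
import open3d as o3d

voxel_length = 4.0 / 512.0   # ~8 mm voxels (placeholder)
sdf_trunc = 0.04             # truncation distance in meters; this is the value to experiment with

volume = o3d.pipelines.integration.ScalableTSDFVolume(
    voxel_length=voxel_length,
    sdf_trunc=sdf_trunc,
    color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)

# Each aligned RGBD frame is then fused with its camera pose:
# volume.integrate(rgbd, intrinsics, extrinsic)
```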

Ritchizh avatar Oct 24 '22 20:10 Ritchizh

Thank you for your reply.

  1. My dataset may not meet your requirement that the change between adjacent frames is small; is there a specific standard for this? Does this condition mainly affect the ICP step? However, I never skip frames, i.e. I always keep "skip_N_frames = 1".
  2. Does your code require background removal? Is that mandatory? My goal is not limited to reconstructing portraits, so background removal is not something I want to depend on. Also, judging from my reconstruction results, I don't think the missing background removal is the cause of the incorrect output; it seems more likely to be caused by incorrect extrinsic parameters. I'm not sure whether this idea is correct, so please correct me if I'm wrong! I did not perform that step, but I did modify the truncation value, and the reconstruction result is still incorrect.

Thanks again for your answer!

jly0810 avatar Oct 25 '22 01:10 jly0810

  1. If you look at the definition of ICP, you can see that it tries to find a rigid transformation (translation + rotation) that tightly matches two point clouds in space. So the closer your adjacent point clouds are, the easier it is to find this transform. You can try tuning the ICP function parameters to make the alignment converge. ICP tries to find matching point pairs in the two point clouds (based on various criteria: the closest point, the intersection of the source point's normal ray with the destination surface, etc.). This means that the more clearly defined the tracked objects are, the better. If you bring along the background wall plane, it will surely affect the alignment. In the Open3d tutorial example there is a point cloud of a chair against a wall; my guess is that it will work properly only if you move the camera but do not move the chair relative to the wall. If the chair is moved, it is not clear which objects should be matched: align the walls in the two frames, or align the chairs.

  2. However, the first step of the algorithm is a rough alignment of the point clouds with RANSAC: http://www.open3d.org/docs/release/tutorial/pipelines/global_registration.html?highlight=ransac Only after that is the more delicate ICP used. I would recommend taking two of your point clouds and running the RANSAC example from the link above on them, to see whether they can be aligned (a condensed sketch of this test is given after this list).

  3. What sensor do you use to record data?
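
For reference, here is a condensed sketch of that test, following the Open3D global-registration tutorial linked above: RANSAC on FPFH features of two downsampled clouds, then point-to-plane ICP refinement. The file names, voxel size, and thresholds are assumptions, and older Open3D versions use o3d.registration instead of o3d.pipelines.registration:

```python
import open3d as o3d

voxel = 0.05  # downsampling voxel size in meters (assumed; tune to your scene scale)

def preprocess(pcd):
    # Downsample, estimate normals, and compute FPFH features for RANSAC.
    down = pcd.voxel_down_sample(voxel)
    down.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        down, o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=100))
    return down, fpfh

source = o3d.io.read_point_cloud("cloud0.ply")   # hypothetical file names
target = o3d.io.read_point_cloud("cloud1.ply")
src_down, src_fpfh = preprocess(source)
tgt_down, tgt_fpfh = preprocess(target)

# 1) Rough global alignment with RANSAC on feature correspondences.
ransac = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
    src_down, tgt_down, src_fpfh, tgt_fpfh, True, 1.5 * voxel,
    o3d.pipelines.registration.TransformationEstimationPointToPoint(False), 3,
    [o3d.pipelines.registration.CorrespondenceCheckerBasedOnEdgeLength(0.9),
     o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(1.5 * voxel)],
    o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))

# 2) Delicate refinement with point-to-plane ICP, starting from the RANSAC result.
icp = o3d.pipelines.registration.registration_icp(
    src_down, tgt_down, 0.4 * voxel, ransac.transformation,
    o3d.pipelines.registration.TransformationEstimationPointToPlane())
print("fitness:", icp.fitness, "inlier RMSE:", icp.inlier_rmse)
```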

Ritchizh avatar Oct 25 '22 12:10 Ritchizh

  1. In my dataset only the camera moves; objects in the scene do not move relative to each other. The chair and the figure can be regarded as one object, so I don't think the ambiguity you describe about which object to match applies here.
  2. I'll try this later.
  3. The dataset above was rendered in Blender, so I also have the camera's extrinsic parameters. I tried the following project (https://github.com/andyzeng/tsdf-fusion-python) and the reconstructed results did not overlap. I studied it for a long time and could not find the problem; I kept suspecting the camera extrinsics, which is why I found your code and tried it. In that earlier work I used the extrinsic parameters obtained from calibration as input. I also have data captured with a RealSense camera: it gives correct results in the project above, but the results are incorrect here.

jly0810 avatar Oct 25 '22 12:10 jly0810

  1. It is strange that your data captured with RealSense fails here (is it a RealSense D435?). Have you checked whether the intrinsics of your RealSense camera are the same as mine? https://github.com/Ritchizh/RGBD-Integration-2020/blob/84f4e4fe8d8fa6e5deec50c3d16df5ebaf4de707/main__TSDF_Integrate__color_depth.py#L103-L108
color_stream = profile.get_stream(rs.stream.color)              # rs = pyrealsense2
color_video_stream = color_stream.as_video_stream_profile()
color_intrinsics = color_video_stream.get_intrinsics()          # depth is aligned to color, so the color intrinsics apply to both

In this project it is assumed that you aligned the depth and color frames by means of pyrealsense when recording the data (example), so extrinsic parameters are not needed; only the camera's intrinsic parameters are used to create a point cloud.
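
For completeness, a short sketch of what that recording-time alignment looks like with pyrealsense2; the stream resolution and frame rate are assumptions:

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)   # assumed resolution/fps
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
profile = pipeline.start(config)

align = rs.align(rs.stream.color)   # map every depth frame into the color camera's frame

frames = pipeline.wait_for_frames()
aligned = align.process(frames)
depth_frame = aligned.get_depth_frame()   # now pixel-aligned with the color frame
color_frame = aligned.get_color_frame()

# With aligned frames, the color stream's intrinsics describe both images,
# so no depth-to-color extrinsics are needed when building the point cloud.
intr = profile.get_stream(rs.stream.color).as_video_stream_profile().get_intrinsics()
print(intr.fx, intr.fy, intr.ppx, intr.ppy)

pipeline.stop()
```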

Ritchizh avatar Oct 25 '22 15:10 Ritchizh

Hello, I have a question. What does the pose matrix in this function represent, i.e. the camera extrinsic matrix camera_poses[num_cam_pose].pose passed to volume.integrate(rgbd, cameraIntrinsics, camera_poses[num_cam_pose].pose)? Does it mean camera to world (in other words: the coordinates in the camera coordinate system = pose * the coordinates in the world coordinate system), or world to camera (the coordinates in the world coordinate system = pose * the coordinates in the camera coordinate system)? I hope you can resolve my doubts, thanks.

jly0810 avatar Nov 03 '22 02:11 jly0810