
Some questions about warping and testing.

whyygug opened this issue 2 years ago • 2 comments

Your paper is impressive and insightful, and thanks for your excellent work.

I have some questions from reading your paper.

  1. Why do you obtain the synthetic I_{t-1} by inverse warping? It seems the synthetic I_{t-1} could be produced directly by filling the image grid with the RGB values of the pixels that have the closer depth, just as you obtain the forward-warped depth map (see the inverse-warping sketch after this list).

  2. Is forward warping non-differentiable?

  3. Is the evaluation of dynamic objects' depth on KITTI done on the Eigen test set? If so, does each image in the Eigen test set have a ground-truth semantic mask label? Or do you evaluate dynamic objects' depth on KITTI using another split in which each image has a ground-truth semantic mask label?
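For reference, the "inverse warping" asked about in question 1 is the standard differentiable view-synthesis warp used in self-supervised depth work: for every target pixel, the target depth and relative pose tell us where it lands in the source frame, and the source RGB is bilinearly sampled there. Below is a minimal sketch of that idea; the function name `inverse_warp` and the arguments (`K`, `K_inv`, `T_tgt_to_src`, shapes) are illustrative assumptions, not the DynamicDepth API.

```python
import torch
import torch.nn.functional as F

def inverse_warp(src_img, tgt_depth, T_tgt_to_src, K, K_inv):
    """Minimal inverse-warping sketch (not the DynamicDepth code).
    src_img: (B, 3, H, W), tgt_depth: (B, 1, H, W),
    T_tgt_to_src: (B, 4, 4), K / K_inv: (B, 3, 3)."""
    B, _, H, W = src_img.shape
    # Homogeneous pixel grid of the target view, shape (B, 3, H*W)
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float()
    pix = pix.view(1, 3, -1).expand(B, -1, -1)
    # Back-project target pixels to 3D, then project them into the source view
    cam = (K_inv @ pix) * tgt_depth.reshape(B, 1, -1)              # (B, 3, H*W)
    cam = torch.cat([cam, torch.ones(B, 1, H * W, device=cam.device)], dim=1)
    src_pix = K @ (T_tgt_to_src @ cam)[:, :3]                      # (B, 3, H*W)
    src_pix = src_pix[:, :2] / (src_pix[:, 2:3] + 1e-7)
    # Normalise to [-1, 1] and bilinearly sample the source image (differentiable)
    gx = src_pix[:, 0].reshape(B, H, W) / (W - 1) * 2 - 1
    gy = src_pix[:, 1].reshape(B, H, W) / (H - 1) * 2 - 1
    grid = torch.stack([gx, gy], dim=-1)
    return F.grid_sample(src_img, grid, padding_mode="border", align_corners=True)
```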

Thanks.

whyygug avatar Apr 14 '23 15:04 whyygug

Hi:

Thank you for your interest!

1, Could you specify which part of the paper/code you are referring to?

2, Forward warping is differentiable, but it is a little tricky: multiple pixels may warp to the same grid cell (a rough z-buffer illustration follows below).

3, I tested on the KITTI Eigen test set; the semantic masks come from the off-the-shelf instance segmentation model "Efficient-PS".
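To make point 2 concrete, here is a rough z-buffer sketch of forward warping a depth map (an assumption for illustration, not the repo's implementation; requires PyTorch >= 1.12 for `scatter_reduce`). When several source pixels project to the same target cell, the closest one wins; that hard assignment is what makes forward warping trickier to differentiate through than `grid_sample`-based inverse warping.

```python
import torch

def splat_depth(depth_src, tgt_x, tgt_y, H, W):
    """Z-buffer style forward warp of a depth map (rough sketch, not the repo code).
    depth_src: (N,) depths of source pixels; tgt_x, tgt_y: (N,) integer target
    coordinates they project to. Several source pixels may hit the same target
    cell; keep the smallest depth so the closest surface wins."""
    valid = (tgt_x >= 0) & (tgt_x < W) & (tgt_y >= 0) & (tgt_y < H)
    idx = (tgt_y[valid] * W + tgt_x[valid]).long()
    out = torch.full((H * W,), float("inf"))
    out = out.scatter_reduce(0, idx, depth_src[valid], reduce="amin", include_self=True)
    out[out == float("inf")] = 0.0   # holes: no source pixel landed in this cell
    return out.view(H, W)
```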

Thank you!

Sincerely, Ziyue Feng

fengziyue avatar Apr 14 '23 16:04 fengziyue

Thanks for your quick response.

  1. I'm referring to the function located at: https://github.com/AutoAILab/DynamicDepth/blob/93e374963e54d6d323484b557d2461d3d7b7d875/dynamicdepth/rigid_warp.py#L534

where the forward-warped depth map is obtained by: https://github.com/AutoAILab/DynamicDepth/blob/93e374963e54d6d323484b557d2461d3d7b7d875/dynamicdepth/rigid_warp.py#L569-L581

but the forward-warped image is obtained by: https://github.com/AutoAILab/DynamicDepth/blob/93e374963e54d6d323484b557d2461d3d7b7d875/dynamicdepth/rigid_warp.py#L591

Why do you obtain the forward-warped image by inverse warping? It seems the forward-warped image could also be produced directly by forward warping, i.e., by filling the image grid with the RGB values of the pixels that have the closer depth, just as you obtain the forward-warped depth map depth_w (a sketch of this direct splatting is given after this list). The inverse warping inside forward warping seems redundant and adds computational cost.

  2. I understand that you use the segmentation mask from "Efficient-PS" to obtain the disentangled image at inference time. What I mean is: which mask do you use to select the GT depth of dynamic objects when you evaluate only the depth of dynamic objects? Do you use the GT semantic mask to filter the depth? If not, why not? Is it because not every image in the Eigen test set has a GT semantic mask label?
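For clarity, here is a sketch of the direct RGB forward warp suggested in point 1: reuse the z-buffer (per-cell minimum depth) and fill each target cell with the colour of the source pixel that won it. The function name and argument shapes are hypothetical, and target coordinates are assumed already in-bounds to keep the sketch short.

```python
import torch

def splat_rgb_by_depth(rgb_src, depth_src, tgt_x, tgt_y, H, W):
    """Sketch of the suggestion above (not the repo's code): forward-warp RGB
    directly by filling each target cell with the colour of the source pixel
    that projects there with the smallest depth.
    rgb_src: (3, N); depth_src, tgt_x, tgt_y: (N,) with in-bounds integer coords."""
    idx = (tgt_y * W + tgt_x).long()
    # Z-buffer: per-cell minimum depth (PyTorch >= 1.12 for scatter_reduce)
    zbuf = torch.full((H * W,), float("inf")).scatter_reduce(
        0, idx, depth_src, reduce="amin", include_self=True)
    winners = depth_src == zbuf[idx]              # pixels closest within their cell
    rgb_w = torch.zeros(3, H * W)                 # unfilled cells stay black (holes)
    rgb_w[:, idx[winners]] = rgb_src[:, winners]  # ties resolved arbitrarily
    return rgb_w.view(3, H, W)
```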

whyygug avatar Apr 15 '23 09:04 whyygug