DeepV2D

Why scale depth_pred for evaluation?

Open flamehaze1115 opened this issue 3 years ago • 3 comments

Hello. I noticed that you scale both the depth and pose estimates for evaluation. Scaling the pose is reasonable and consistent with previous work, but scaling depth_pred as well seems unfair, since the ground-truth depth is used in the loss function. Yours is a supervised depth estimation method, so why do you also scale the estimated depth?

flamehaze1115 avatar Aug 21 '20 11:08 flamehaze1115

Hi, we address the standard setting of SfM from multiple views, where methods reconstruct 3D up to a global scale. It is not possible to recover global scale using SfM. Existing methods that focus on the same problem, such as BA-Net and DeMoN, also scale the depth for evaluation, so we simply follow this setup. This allows us to compare to existing works such as DeMoN and BA-Net, and to classical methods such as COLMAP, using consistent evaluation criteria.

All results reported in our paper use global depth scaling, so the evaluation is indeed fair. Some methods, such as DORN and FCRN, do not report scale-matched metrics, so we downloaded the pretrained models and evaluated them in our setting.

zachteed avatar Aug 21 '20 16:08 zachteed
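
For readers following along, here is a minimal sketch of what scale-matched depth evaluation can look like. It is not taken from the DeepV2D code: the function names `scale_match` and `abs_rel`, the `min_depth` threshold, and the use of a median ratio as the global scale factor are all assumptions made for illustration (the median ratio is one common choice; the thread does not say which estimator is used).

```python
import numpy as np

def scale_match(pred, gt, min_depth=1e-3):
    """Rescale pred by one global factor so its median matches gt.

    Assumption: the factor is the ratio of medians over valid pixels.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = (gt > min_depth) & (pred > min_depth)
    scale = np.median(gt[valid]) / np.median(pred[valid])
    return pred * scale, scale

def abs_rel(pred, gt, min_depth=1e-3):
    """Mean absolute relative error over pixels with valid ground truth."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > min_depth
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

# Toy example: the prediction has the right shape but half the scale.
gt = np.array([[1.0, 2.0], [4.0, 0.0]])    # 0.0 marks a missing measurement
pred = np.array([[0.5, 1.0], [2.0, 3.0]])

scaled, scale = scale_match(pred, gt)
raw_error = abs_rel(pred, gt)              # error without scale matching
matched_error = abs_rel(scaled, gt)        # error after scale matching
```

In this toy case the raw error is large while the scale-matched error vanishes, which is the distinction the discussion turns on: scale matching measures the reconstruction up to a global factor, not the absolute metric depth.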

Yes, I know previous methods also report scale-matched metrics. However, since you use ground-truth depth to train the depthnet on a single dataset, the depthnet will try to learn the real scale, as single-view depth estimation methods do. Single-view methods never scale depth_pred for evaluation, so I cannot understand why the scaling is needed. I evaluated your model on a ScanNet video sequence and found that the scale of each estimated depth map varies from 0.7 to 1.5. Even if the scale is ambiguous, it should stay around a fixed value.

flamehaze1115 avatar Aug 21 '20 17:08 flamehaze1115
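
The per-frame scale variation described above could be measured along the following lines. This is a hedged sketch on synthetic data, not the commenter's actual script: `per_frame_scales` is a hypothetical helper, the median-ratio estimator is an assumption, and the frames below are randomly generated rather than taken from ScanNet.

```python
import numpy as np

def per_frame_scales(preds, gts, min_depth=1e-3):
    """Return one scale factor per frame (assumed: median(gt) / median(pred))."""
    scales = []
    for pred, gt in zip(preds, gts):
        pred = np.asarray(pred, dtype=np.float64)
        gt = np.asarray(gt, dtype=np.float64)
        valid = (gt > min_depth) & (pred > min_depth)
        scales.append(np.median(gt[valid]) / np.median(pred[valid]))
    return np.array(scales)

# Synthetic sequence: each predicted frame is the ground truth divided by a
# different factor, so the recovered per-frame scales should be those factors.
rng = np.random.default_rng(0)
gts = [rng.uniform(0.5, 5.0, size=(4, 4)) for _ in range(3)]
true_scales = [0.7, 1.0, 1.5]
preds = [gt / s for gt, s in zip(gts, true_scales)]

scales = per_frame_scales(preds, gts)
spread = scales.max() / scales.min()   # ratio between largest and smallest scale
```

A spread close to 1 would mean the predictions share a consistent scale across the sequence; a larger spread indicates that a separate factor is being fitted to each frame.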

Our task setup is standard: reconstruct 3D up to scale, with access to ground truth depth in training. This is the same setup as prior work. What a method does internally does not affect the fairness of our evaluation. We included single-view methods as reasonable baselines, but the main comparisons are with multi-view methods. That said, you could solve a different task where 3D must be reconstructed exactly, and our method may be adapted to do this, but this was not the goal of our work.

zachteed avatar Aug 21 '20 17:08 zachteed