Significance of Focal Parameter

Open mvish7 opened this issue 1 year ago • 2 comments

Hello, thanks for open-sourcing this amazing work. I'm interested in fine-tuning DepthAnything for metric depth.

I checked the metric_depth/train_test_inputs/kitti_eigen_train_files_with_gt.txt file and saw the focal parameter. E.g.

2011_09_26/2011_09_26_drive_0057_sync/image_02/data/0000000116.png 2011_09_26_drive_0057_sync/proj_depth/groundtruth/image_02/0000000116.png 721.5377

2011_10_03/2011_10_03_drive_0034_sync/image_02/data/0000000978.png 2011_10_03_drive_0034_sync/proj_depth/groundtruth/image_02/0000000978.png 718.856
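
To make the layout of these entries explicit: each line appears to hold three whitespace-separated fields (image path, ground-truth depth path, focal value). Below is a minimal sketch of reading one such line. `KittiSample` and `parse_file_list_line` are hypothetical names used for illustration, not the repository's data loader.

```python
from typing import NamedTuple


class KittiSample(NamedTuple):
    """One entry of the file list (hypothetical container, for illustration)."""
    image_path: str
    depth_path: str
    focal: float


def parse_file_list_line(line: str) -> KittiSample:
    # Assumption: exactly three whitespace-separated fields per line.
    image_path, depth_path, focal = line.split()
    return KittiSample(image_path, depth_path, float(focal))


# Example using the first entry quoted above.
line = (
    "2011_09_26/2011_09_26_drive_0057_sync/image_02/data/0000000116.png "
    "2011_09_26_drive_0057_sync/proj_depth/groundtruth/image_02/0000000116.png "
    "721.5377"
)
sample = parse_file_list_line(line)
print(sample.focal)
```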

During data loading, this parameter is read from the file and sent along with the image and ground truth in each batch. E.g. https://github.com/LiheYoung/Depth-Anything/blob/f419b7db90b26b2855280c4da484778c4fac759f/metric_depth/zoedepth/data/data_mono.py#L294

However, I do not see "focal" being used anywhere during the forward pass or the loss calculation.

In evaluate.py, focal is given a hard-coded default but is never used in any calculation, e.g.

focal = sample.get('focal', torch.Tensor([715.0873]).cuda())  # This magic number (focal) is only used for evaluating BTS model
pred = infer(model, image, dataset=sample['dataset'][0], focal=focal)

Could you please clarify:

1. Is the focal parameter really necessary for fine-tuning Depth Anything for the metric depth use case?
2. How do I calculate the focal parameter for a dataset, e.g. from camera information? In other words, what information do I need to calculate the focal parameter for each image?

Thanks

mvish7 avatar Feb 07 '24 15:02 mvish7

I changed focal length during metric depth estimation (kitti) and saw no impact.

hgolestaniii avatar Feb 15 '24 15:02 hgolestaniii

The focal length seems to be useful only when BTS is used to get the metric depth. Since they have kinda reused the code from BTS, they have left it there. In this inference code (Depth-Anything/metric_depth), ZoeDepth is used instead of BTS, and it doesn't use the focal length for inference. @LiheYoung can confirm this information.

anveshrddy avatar Mar 07 '24 21:03 anveshrddy
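
On the second question in the original post (how to obtain a focal value from camera information), here is a minimal sketch. It assumes the value in the file list is the horizontal focal length in pixels of a pinhole camera, which is what numbers such as 721.5377 look like. All function names are hypothetical, and the numeric inputs in the example are illustrative rather than taken from any dataset.

```python
import math


def focal_from_fov(image_width_px: float, hfov_deg: float) -> float:
    """Focal length in pixels from the horizontal field of view (pinhole model)."""
    return (image_width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)


def focal_from_sensor(focal_mm: float, sensor_width_mm: float,
                      image_width_px: float) -> float:
    """Focal length in pixels from the lens focal length and sensor width."""
    return focal_mm * image_width_px / sensor_width_mm


def focal_from_projection(P) -> float:
    """Read fx from a 3x3 intrinsic or 3x4 projection matrix (row-major)."""
    return float(P[0][0])


# Illustrative inputs only.
print(focal_from_fov(1000, 90.0))          # 90 degree FOV, 1000 px wide image
print(focal_from_sensor(4.0, 8.0, 1000))   # 4 mm lens, 8 mm sensor, 1000 px
P = [[721.5377, 0.0, 600.0, 0.0],          # fx from the thread; other entries
     [0.0, 721.5377, 180.0, 0.0],          # are placeholders
     [0.0, 0.0, 1.0, 0.0]]
print(focal_from_projection(P))
```

If the dataset ships a calibration file with an intrinsic or rectified projection matrix, reading its top-left entry is usually the simplest route. The other two helpers are for cameras where only the field of view or the lens and sensor specifications are known. As noted in the comments above, the ZoeDepth-based code here does not appear to use the value, so this matters mainly if a focal-dependent model such as BTS is involved.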