Depth-Anything
Significance of Focal Parameter
Hello, thanks for open-sourcing this amazing work. I'm interested in fine-tuning Depth Anything for metric depth.
I checked metric_depth/train_test_inputs/kitti_eigen_train_files_with_gt.txt and saw a focal parameter at the end of each line, e.g.:
2011_09_26/2011_09_26_drive_0057_sync/image_02/data/0000000116.png 2011_09_26_drive_0057_sync/proj_depth/groundtruth/image_02/0000000116.png 721.5377
2011_10_03/2011_10_03_drive_0034_sync/image_02/data/0000000978.png 2011_10_03_drive_0034_sync/proj_depth/groundtruth/image_02/0000000978.png 718.856
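To make the layout explicit: each line appears to hold three whitespace-separated fields (image path, ground-truth depth path, focal value). Here is a minimal sketch of parsing one such line. The names `SplitEntry` and `parse_split_line` are my own for illustration, not the repository's loader.

```python
from typing import NamedTuple


class SplitEntry(NamedTuple):
    """One line of the split file (hypothetical container, not from the repo)."""
    image_path: str
    depth_path: str
    focal: float


def parse_split_line(line: str) -> SplitEntry:
    # Assumes exactly three whitespace-separated fields, as in the lines above.
    image_path, depth_path, focal = line.split()
    return SplitEntry(image_path, depth_path, float(focal))


line = (
    "2011_09_26/2011_09_26_drive_0057_sync/image_02/data/0000000116.png "
    "2011_09_26_drive_0057_sync/proj_depth/groundtruth/image_02/0000000116.png "
    "721.5377"
)
entry = parse_split_line(line)
print(entry.focal)
```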
During data loading, this parameter is read from the file and sent along with the image and ground truth in each batch, e.g. https://github.com/LiheYoung/Depth-Anything/blob/f419b7db90b26b2855280c4da484778c4fac759f/metric_depth/zoedepth/data/data_mono.py#L294
However I do not see "focal" being used anywhere during forward pass or loss calculations.
In evaluate.py, focal is given a hard-coded default but is never used in any calculation, e.g.:
focal = sample.get('focal', torch.Tensor( [715.0873]).cuda()) # This magic number (focal) is only used for evaluating BTS model
pred = infer(model, image, dataset=sample['dataset'][0], focal=focal)
Could you please clarify:
Is the focal parameter really necessary for fine-tuning Depth Anything for the metric depth use case? If so, how do I calculate it for a dataset, e.g. from camera information, and what information do I need to compute it for each image?
Thanks
I changed the focal length during metric depth estimation (KITTI) and saw no impact.
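A check like this can be sketched roughly as follows. `predict` is a hypothetical stand-in for whatever inference call you use (it is not the repository's `infer`), and `ignores_focal` is a toy model that accepts `focal` but never reads it, purely to show the idea.

```python
def focal_has_effect(predict, image, focals, tol=1e-6):
    """Return True if any focal value changes the prediction beyond tol."""
    baseline = predict(image, focal=focals[0])
    for focal in focals[1:]:
        output = predict(image, focal=focal)
        if any(abs(a - b) > tol for a, b in zip(baseline, output)):
            return True
    return False


def ignores_focal(image, focal=None):
    # Toy model: accepts focal but does not use it.
    return [2.0 * px for px in image]


def uses_focal(image, focal=None):
    # Toy model whose output does depend on focal, for contrast.
    return [px * focal for px in image]


image = [0.1, 0.5, 0.9]        # placeholder "image" values
focals = [715.0873, 1000.0]    # default from evaluate.py plus an arbitrary value
print(focal_has_effect(ignores_focal, image, focals))
print(focal_has_effect(uses_focal, image, focals))
```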
The focal length seems to matter only when BTS is used to obtain metric depth. As they have largely reused the code from BTS, they have left it in place. The inference here (Depth-Anything/metric_depth) uses ZoeDepth instead of BTS, and ZoeDepth doesn't use the focal length. @LiheYoung can confirm this.
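On the "how to calculate it" part, in case it is needed for a BTS-style model: under a standard pinhole camera model, the focal length in pixels is the top-left entry of the intrinsic matrix K, or it can be derived from the lens focal length and sensor width. The values in the split file look consistent with pixel units, though I have not verified where they come from. A rough sketch, with helper names of my own and made-up numbers apart from 721.5377:

```python
def focal_from_intrinsics(K):
    """Focal length in pixels (fx) from a 3x3 pinhole intrinsic matrix."""
    return float(K[0][0])


def focal_from_sensor(focal_mm, sensor_width_mm, image_width_px):
    """Convert a lens focal length in mm to pixels along the image width."""
    return focal_mm * image_width_px / sensor_width_mm


# Illustrative intrinsics: fx is taken from the split file above,
# the principal point (600.0, 170.0) is invented for the example.
K = [
    [721.5377, 0.0, 600.0],
    [0.0, 721.5377, 170.0],
    [0.0, 0.0, 1.0],
]
print(focal_from_intrinsics(K))

# Invented camera: 4 mm lens, 8 mm wide sensor, 1000 px wide image.
print(focal_from_sensor(4.0, 8.0, 1000))
```

If the images are resized before being fed to the model, the pixel focal length scales by the same factor as the image width.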