Depth-Anything Small encoder for metric depth estimation?

Open Denny-kef opened this issue 1 year ago • 7 comments

What steps would you recommend for using the small Depth Anything relative depth estimation model with the metric depth estimation pipeline? Will it need to be retrained, or can I swap the encoders and have it still work reasonably well?

Denny-kef avatar Feb 08 '24 20:02 Denny-kef

I fine-tuned a metric depth model on a custom dataset. From my tinkering with the Depth Anything codebase:

To switch the encoder, you need to modify the build function linked below:

https://github.com/LiheYoung/Depth-Anything/blob/e7ef4b4b7a0afd8a05ce9564f04c1e5b68268516/metric_depth/zoedepth/models/base_models/depth_anything.py#L334

Replace

```python
depth_anything = DPT_DINOv2(encoder='vitl', out_channels=[256, 512, 1024, 1024], use_clstoken=False)
```

with

```python
depth_anything = DPT_DINOv2(encoder='vits', features=64, out_channels=[48, 96, 192, 384], use_clstoken=False)
```
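
If the build function also loads relative-depth weights right after constructing the model (as in the linked revision), the checkpoint path has to change to the small variant as well. A minimal sketch, assuming the checkpoint follows the `depth_anything_vitl14.pth` naming mentioned later in this thread:

```python
import torch

# Small-encoder constructor from the swap above.
depth_anything = DPT_DINOv2(encoder='vits', features=64,
                            out_channels=[48, 96, 192, 384],
                            use_clstoken=False)

# Assumed file name ('depth_anything_vits14.pth'), mirroring the vitl
# checkpoint naming -- verify it against your local ./checkpoints folder.
state_dict = torch.load('./checkpoints/depth_anything_vits14.pth',
                        map_location='cpu')
# strict=False tolerates key differences between the relative-depth
# checkpoint and the metric-depth wrapper around the model.
depth_anything.load_state_dict(state_dict, strict=False)
```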

On the dataset side:

You need to adapt your dataset to follow the preprocessing and augmentation done in https://github.com/LiheYoung/Depth-Anything/blob/e7ef4b4b7a0afd8a05ce9564f04c1e5b68268516/metric_depth/zoedepth/data/data_mono.py#L292
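
For orientation, here is a minimal sketch of a per-sample loader that mimics the format the linked preprocessing works with; the dict keys ('image', 'depth', 'focal') are assumptions based on the ZoeDepth data pipeline and should be checked against data_mono.py:

```python
import numpy as np
from PIL import Image

def load_sample(image_path, depth_path, focal):
    """Hypothetical per-sample loader; output keys are assumed, not verified."""
    # RGB as float32 in [0, 1], HWC layout.
    image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
    # Depth GT as float32 in metres (apply custom scaling first if needed,
    # see the note below), with a trailing channel axis.
    depth = np.asarray(Image.open(depth_path), dtype=np.float32)[..., None]
    return {'image': image, 'depth': depth, 'focal': focal}
```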

If your depth GT needs custom scaling, as in the line linked below, apply that scaling and you should be good to go. https://github.com/LiheYoung/Depth-Anything/blob/e7ef4b4b7a0afd8a05ce9564f04c1e5b68268516/metric_depth/zoedepth/data/data_mono.py#L353
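
As a concrete (non-repo) example of such scaling: 16-bit PNG ground truth is often stored in 1/256 m units (KITTI convention) or in millimetres (NYU convention), so the conversion to metres would be:

```python
# depth_png: raw uint16 array read from the GT PNG (hypothetical variable).
# Pick the divisor that matches your GT encoding.
depth_gt = depth_png.astype(np.float32) / 256.0     # KITTI-style: 1/256 m
# depth_gt = depth_png.astype(np.float32) / 1000.0  # NYU-style: millimetres
```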

mvish7 avatar Feb 13 '24 09:02 mvish7

@mvish7 Can you just swap out the encoder without fine-tuning the model and have it still work?

Denny-kef avatar Feb 15 '24 15:02 Denny-kef

Hi, as the base Depth Anything model is trained for relative depth, just swapping the encoder won't produce metric depth. In my experience, within 3 epochs of fine-tuning the model had learned the depth scale of our custom dataset.

mvish7 avatar Feb 15 '24 15:02 mvish7

@mvish7 Hi, I want to ask how many RGB-depth pairs you used for fine-tuning. I have used 100 pairs of my own data and trained for 50 epochs, but the results don't seem accurate. I also want to know whether the --pretrained_resource="" argument in train_mono.py should point to depth_anything_metric_depth_outdoor.pt; if I use depth_anything_vitl14.pth, I get a state_dict mismatch error.

cosmosmosco avatar Feb 27 '24 06:02 cosmosmosco

Hi @cosmosmosco, the --pretrained_resource argument is for loading a pre-trained checkpoint (containing the entire model's parameters). Please just set it to an empty string when launching your training script.
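
For concreteness, a hedged example launch: train_mono.py and --pretrained_resource come from this thread, while -m zoedepth and the -d dataset flag follow the ZoeDepth training interface and are assumptions to verify against your configs:

```bash
# '-d kitti' is a placeholder; substitute your dataset config name.
python train_mono.py -m zoedepth -d kitti --pretrained_resource=""
```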

LiheYoung avatar Mar 17 '24 10:03 LiheYoung

Thank you. It really helps.

cosmosmosco avatar Mar 18 '24 08:03 cosmosmosco

@mvish7 @cosmosmosco I am trying to fine-tune the ZoeDepth + Depth Anything model on a custom outdoor dataset for which I have pixel-wise GT. Could you shed some light on data preparation, which config to use, and how best to run the training script, including any script changes and the specific arguments needed for a custom dataset? Thanks in advance!

abhishek0696 avatar Mar 25 '24 20:03 abhishek0696