EcoDepth

Inference with outdoor model

maxrapp01 opened this issue 9 months ago • 3 comments

Hello, and thanks for the great documentation.

When I run the indoor model on NYU, I get great results (on 100 NYU images). Configs:

EcoDepth/checkpoints/weights_indoor.ckpt
Namespace(max_depth=10.0, train_from_scratch=False, ckpt_path='/mnt/mh_grupp/EcoDepth/checkpoints/weights_indoor.ckpt', pred_only=True, flip_test=False, grayscale=False, eval_crop='no_crop', no_of_classes=100, scene='indoor')

Results:

abs_rel: 0.0478 sq_rel: 0.0205 rmse: 0.1984 rmse_log: 0.0689 a1: 0.9735 a2: 0.9954 a3: 0.9985

The outdoor model on KITTI gives incorrect results (on 100 KITTI images). Configs:

EcoDepth/checkpoints/weights_outdoor.ckpt
Namespace(max_depth=80.0, train_from_scratch=False, ckpt_path='/mnt/mh_grupp/EcoDepth/checkpoints/weights_outdoor.ckpt', pred_only=True, flip_test=False, grayscale=False, eval_crop='no_crop', no_of_classes=200, scene='outdoor')

Results:

abs_rel: 0.8739 sq_rel: 12.4502 rmse: 17.6043 rmse_log: 2.0843 a1: 0.0000 a2: 0.0000 a3: 0.0000

I call the model the same way for both datasets, always with tensors that have a batch dimension:

with torch.no_grad():
    for image in images:
        # add a batch dimension, predict, then drop the singleton dims
        depth = model(image.unsqueeze(0)).squeeze()
        preds.append(depth)

Am I missing some pre-processing that is not needed for the indoor model but is necessary for the outdoor model? I tried looking at infer_image.py, but nothing I tried gave reasonable results. In short, calling the model in exactly the same way on NYU and KITTI gives good and bad results respectively. What is the correct way to run inference with the outdoor model? Thanks.

maxrapp01 · Apr 02 '25 14:04

Hi,

Can you try visualizing the images? My suspicion is that the scale factor is incorrect, since the a{1,2,3} values are coming out as 0. If that is indeed the case, the depth maps should still look fine visually.
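
For reference, these metrics are typically computed with the standard Eigen-style formulas, roughly as in the sketch below (not EcoDepth's exact evaluation code; pred and gt are flattened arrays of valid metric depths). If the prediction is off from the ground truth by a large constant scale, max(gt/pred, pred/gt) exceeds 1.25^3 at every pixel and a1, a2, and a3 all collapse to 0:

import numpy as np

def compute_metrics(pred, gt):
    # threshold accuracies: fraction of pixels within 1.25**k of the GT
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3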

You also seem to be computing the metrics with custom code. You have to be careful when loading the ground-truth depth maps and ensure that they are converted to metric space. Specifically, you need to divide the raw depth values by a factor of 256. You can take a look at version 1.0.0 for reference.
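
For example, loading a KITTI ground-truth depth PNG might look like this (a minimal sketch; the file name is illustrative):

import numpy as np
from PIL import Image

raw = np.asarray(Image.open("0000000005.png"), dtype=np.float32)
depth_m = raw / 256.0   # KITTI stores depth as uint16, metres * 256
valid = raw > 0         # zero-valued pixels carry no ground truth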

Aradhye2002 · Apr 02 '25 15:04

Hey @Aradhye2002, here's the ground truth I'm comparing against:

[image: ground-truth depth map]

And the predicted depth from the model:

[image: predicted depth map from the pretrained outdoor model]

My metric computation works well with other models such as UniDepth. In fact, I just trained the EcoDepth outdoor model on KITTI overnight, and the results are reasonable (after 3 epochs, on 100 KITTI images):

abs_rel: 0.0527 sq_rel: 0.1773 rmse: 2.5260 rmse_log: 0.0807 a1: 0.9725 a2: 0.9970 a3: 0.9995

The results above come from calling the (now fine-tuned) outdoor model the same way as before. That call works for the indoor model and for the fine-tuned model, but not for the original outdoor model:

with torch.no_grad():
    for image in images:
        depth = model(image.unsqueeze(0)).squeeze()
        preds.append(depth)

Each image is a tensor in the range [0, 1], for example:

Image dtype: torch.float32
Image shape: torch.Size([1, 3, 375, 1242])
Image min: 0.0
Image max: 1.0
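
For reference, one common way to get tensors in that form (a sketch only; the file name is illustrative):

import torchvision.transforms.functional as TF
from PIL import Image

img = Image.open("kitti_frame.png").convert("RGB")
image = TF.to_tensor(img)  # float32, [3, H, W], values scaled to [0, 1]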

It may be due to the pre-processing and resizing you do in infer_image.py. However, when I used your inference code, the results became much worse, even for the indoor model on NYU.

Here's the resulting depth map from the fine-tuned outdoor model, which is good:

[image: depth map from the fine-tuned outdoor model]

I'm guessing the outdoor model works as expected for you?

maxrapp01 · Apr 03 '25 07:04

Any thoughts, @Aradhye2002?

maxrapp01 · Apr 08 '25 06:04