FoundationPose Incompatible depth format

I am getting the same error as in issue #86:

python run_demo.py 
Warp 1.0.2 initialized:
   CUDA Toolkit 11.5, Driver 12.3
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "NVIDIA GeForce RTX 4060 Laptop GPU" (8 GiB, sm_89, mempool enabled)
   Kernel cache:
     /home/utsav/.cache/warp/1.0.2
[__init__()] self.cfg: 
 lr: 0.0001
c_in: 6
zfar: 'Infinity'
debug: null
n_view: 1
run_id: 3wy8qqex
use_BN: true
exp_name: 2024-01-11-20-02-45
n_epochs: 62
save_dir: /home/bowenw/debug/2024-01-11-20-02-45/
use_mask: false
loss_type: pairwise_valid
optimizer: adam
batch_size: 64
crop_ratio: 1.1
enable_amp: true
use_normal: false
max_num_key: null
warmup_step: -1
input_resize:
- 160
- 160
max_step_val: 1000
vis_interval: 1000
weight_decay: 0
normalize_xyz: true
resume_run_id: null
clip_grad_norm: 'Infinity'
lr_epoch_decay: 500
render_backend: nvdiffrast
train_num_pair: 5
lr_decay_epochs:
- 50
n_epochs_warmup: 1
make_pair_online: false
gradient_max_norm: 'Infinity'
max_step_per_epoch: 10000
n_rendering_workers: 1
save_epoch_interval: 100
n_dataloader_workers: 100
split_objects_across_gpus: true
ckpt_dir: /home/utsav/IProject/FoundationPose/learning/training/../../weights/2024-01-11-20-02-45/model_best.pth

[__init__()] self.h5_file:None
[__init__()] Using pretrained model from /home/utsav/IProject/FoundationPose/learning/training/../../weights/2024-01-11-20-02-45/model_best.pth
[__init__()] init done
[__init__()] welcome
[__init__()] self.cfg: 
 lr: 0.0001
c_in: 6
zfar: .inf
debug: null
w_rot: 0.1
n_view: 1
run_id: null
use_BN: true
rot_rep: axis_angle
ckpt_dir: /home/utsav/IProject/FoundationPose/learning/training/../../weights/2023-10-28-18-33-37/model_best.pth
exp_name: 2023-10-28-18-33-37
save_dir: /tmp/2023-10-28-18-33-37/
loss_type: l2
optimizer: adam
trans_rep: tracknet
batch_size: 64
crop_ratio: 1.2
use_normal: false
BN_momentum: 0.1
max_num_key: null
warmup_step: -1
input_resize:
- 160
- 160
max_step_val: 1000
normal_uint8: false
vis_interval: 1000
weight_decay: 0
n_max_objects: null
normalize_xyz: true
clip_grad_norm: 'Infinity'
rot_normalizer: 0.3490658503988659
trans_normalizer:
- 0.019999999552965164
- 0.019999999552965164
- 0.05000000074505806
max_step_per_epoch: 25000
val_epoch_interval: 10
n_dataloader_workers: 60
enable_amp: true
use_mask: false

[__init__()] self.h5_file:
[__init__()] Using pretrained model from /home/utsav/IProject/FoundationPose/learning/training/../../weights/2023-10-28-18-33-37/model_best.pth
[__init__()] init done
[reset_object()] self.diameter:0.013781790921357066, vox_size:0.003
[reset_object()] self.pts:torch.Size([35, 3])
[reset_object()] reset done
[make_rotation_grid()] cam_in_obs:(42, 4, 4)
[make_rotation_grid()] rot_grid:(252, 4, 4)
num original candidates = 252
num of pose after clustering: 252
[make_rotation_grid()] after cluster, rot_grid:(252, 4, 4)
[make_rotation_grid()] self.rot_grid: torch.Size([252, 4, 4])
[<module>()] estimator initialization done
[<module>()] i:0
[register()] Welcome
Module Utils load on device 'cuda:0' took 5.51 ms
Traceback (most recent call last):
  File "/home/utsav/IProject/FoundationPose/run_demo.py", line 52, in <module>
    pose = est.register(K=reader.K, rgb=color, depth=depth, ob_mask=mask, iteration=args.est_refine_iter)
  File "/home/utsav/IProject/FoundationPose/estimater.py", line 173, in register
    depth = erode_depth(depth, radius=2, device='cuda')
  File "/home/utsav/IProject/FoundationPose/Utils.py", line 390, in erode_depth
    wp.launch(kernel=erode_depth_kernel, device=device, dim=[depth.shape[0], depth.shape[1]], inputs=[depth_wp, out_wp, radius, depth_diff_thres, ratio_thres, zfar],)
  File "/home/utsav/anaconda3/envs/foundationpose/lib/python3.9/site-packages/warp/context.py", line 4240, in launch
    pack_args(fwd_args, params)
  File "/home/utsav/anaconda3/envs/foundationpose/lib/python3.9/site-packages/warp/context.py", line 4212, in pack_args
    params.append(pack_arg(kernel, arg_type, arg_name, a, device, adjoint))
  File "/home/utsav/anaconda3/envs/foundationpose/lib/python3.9/site-packages/warp/context.py", line 3972, in pack_arg
    raise RuntimeError(
RuntimeError: Error launching kernel 'erode_depth_kernel', argument 'depth' expects an array with 2 dimension(s) but the passed array has 3 dimension(s).

The model, point cloud, and depth are all in mm scale. My files are available here: https://drive.google.com/drive/folders/1N6HkHg1ASygzplz7SEjm5KxJSVvV5jMC?usp=sharing

Since this is a surgical application, I have to work in mm.

May 12 '24 01:05 utsavrai

The format of the depth image was wrong; I have updated the depth map file in the shared folder. I would still like help with handling depth given in mm. As you can see, the results are not good. It would be very helpful to know what changes should be made to the depth map scale, or to hear any other suggestions. Thank you.
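
For anyone hitting the same RuntimeError: the traceback says `erode_depth_kernel` expects a 2-dimensional `depth` array but received a 3-dimensional one. A minimal sketch of one way to get back to an (H, W) array is shown below. It assumes the depth image was loaded with a channel axis and that every channel holds the same depth value; the helper name `to_2d_depth` is hypothetical and not part of FoundationPose.

```python
import numpy as np


def to_2d_depth(depth: np.ndarray) -> np.ndarray:
    """Return an (H, W) depth map from an (H, W) or (H, W, C) array.

    Assumption: if a channel axis is present, all channels carry the
    same depth value, so keeping the first channel loses nothing.
    """
    if depth.ndim == 2:
        return depth
    if depth.ndim == 3:
        return depth[..., 0]
    raise ValueError(f"Unexpected depth shape: {depth.shape}")


# Example: a 3-channel depth image becomes a 2-D array.
depth_hwc = np.zeros((480, 640, 3), dtype=np.uint16)
depth_hw = to_2d_depth(depth_hwc)
print(depth_hw.shape)
```

If the channels actually differ (for example, a colorized depth visualization), this shortcut does not apply and the depth map has to be re-exported as a single-channel image.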

May 12 '24 12:05 utsavrai

You can divide the depth image (in mm) by 1000 to convert it to meters. Is this your question?
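
A minimal sketch of that conversion, assuming the depth map is a NumPy array of millimeter values (for example `uint16`); the function name `depth_mm_to_m` is hypothetical and not part of the repository:

```python
import numpy as np


def depth_mm_to_m(depth_mm: np.ndarray) -> np.ndarray:
    """Convert a depth map from millimeters to meters as float32."""
    # Cast first so integer inputs are not truncated by the division.
    return depth_mm.astype(np.float32) / 1000.0


# Example: a small depth map in millimeters.
depth_mm = np.array([[0, 1000], [250, 1500]], dtype=np.uint16)
depth_m = depth_mm_to_m(depth_mm)
print(depth_m)
```

Pixels with a value of 0 stay 0 after the division, so a zero that marks missing depth in the input still marks missing depth in the output.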

May 17 '24 05:05 wenbowen123