calibrated-backprojection-network
NYU performance
Hi Alex,
Thanks for your great work. I'm reproducing the NYU v2 (generalization) experiments. I followed your instructions for preparing the NYU data. When I used your pre-trained model kbnet-void1500.pth to evaluate on NYU v2, I got these errors:
(kbnet) zlq@ivg-SYS-7048GR-TR:/home/disk2/code/calibrated-backprojection-network$ bash bash/void/run_knet_nyu_v2_test.sh
usage: run_kbnet.py [-h] --image_path IMAGE_PATH --sparse_depth_path
SPARSE_DEPTH_PATH --intrinsics_path INTRINSICS_PATH
[--ground_truth_path GROUND_TRUTH_PATH]
[--input_channels_image INPUT_CHANNELS_IMAGE]
[--input_channels_depth INPUT_CHANNELS_DEPTH]
[--normalized_image_range NORMALIZED_IMAGE_RANGE [NORMALIZED_IMAGE_RANGE ...]]
[--outlier_removal_kernel_size OUTLIER_REMOVAL_KERNEL_SIZE]
[--outlier_removal_threshold OUTLIER_REMOVAL_THRESHOLD]
[--min_pool_sizes_sparse_to_dense_pool MIN_POOL_SIZES_SPARSE_TO_DENSE_POOL [MIN_POOL_SIZES_SPARSE_TO_DENSE_POOL ...]]
[--max_pool_sizes_sparse_to_dense_pool MAX_POOL_SIZES_SPARSE_TO_DENSE_POOL [MAX_POOL_SIZES_SPARSE_TO_DENSE_POOL ...]]
[--n_convolution_sparse_to_dense_pool N_CONVOLUTION_SPARSE_TO_DENSE_POOL]
[--n_filter_sparse_to_dense_pool N_FILTER_SPARSE_TO_DENSE_POOL]
[--n_filters_encoder_image N_FILTERS_ENCODER_IMAGE [N_FILTERS_ENCODER_IMAGE ...]]
[--n_filters_encoder_depth N_FILTERS_ENCODER_DEPTH [N_FILTERS_ENCODER_DEPTH ...]]
[--resolutions_backprojection RESOLUTIONS_BACKPROJECTION [RESOLUTIONS_BACKPROJECTION ...]]
[--n_filters_decoder N_FILTERS_DECODER [N_FILTERS_DECODER ...]]
[--deconv_type DECONV_TYPE]
[--min_predict_depth MIN_PREDICT_DEPTH]
[--max_predict_depth MAX_PREDICT_DEPTH]
[--weight_initializer WEIGHT_INITIALIZER]
[--activation_func ACTIVATION_FUNC]
[--min_evaluate_depth MIN_EVALUATE_DEPTH]
[--max_evaluate_depth MAX_EVALUATE_DEPTH]
[--output_path OUTPUT_PATH] [--save_outputs]
[--keep_input_filenames]
[--depth_model_restore_path DEPTH_MODEL_RESTORE_PATH]
[--device DEVICE]
run_kbnet.py: error: unrecognized arguments: --avg_pool_sizes_sparse_to_dense_pool 0 --encoder_type knet_v1 fusion_conv_previous sparse_to_dense_pool_v1 --input_type sparse_depth validity_map 3 3 3 0 --n_resolutions_encoder_intrinsics 0 1 2 3 --skip_types image depth --decoder_type multi-scale --output_kernel_size 3 --outlier_removal_method remove
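For context, argparse aborts with exactly this kind of error when a parser no longer defines a flag that the run script still passes. A minimal sketch of the behavior (the flag names are borrowed from the log above for illustration; this is not the repo's actual parser):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--image_path', required=True)

# parse_args() would raise SystemExit with "unrecognized arguments"
# on unknown flags; parse_known_args() returns them instead.
args, unknown = parser.parse_known_args(
    ['--image_path', 'x.txt', '--encoder_type', 'knet_v1'])

print(args.image_path)  # x.txt
print(unknown)          # ['--encoder_type', 'knet_v1']
```

So deleting the stale flags from the script is equivalent to what `parse_known_args` would silently ignore anyway.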
So I deleted the unrecognized arguments and ran it again; this time I got these numbers:
Evaluation results:
MAE RMSE iMAE iRMSE
122.836 228.426 24.147 50.003
+/- +/- +/- +/-
71.550 130.133 16.920 36.531
I know the numbers are close to the results reported in this repo, but I suspect I could reproduce them exactly if the unrecognized arguments were taken into account. My question is: how can I get results closer to your reported ones? Is there something wrong on my end?
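For reference when comparing runs, the four reported metrics are typically computed as follows in depth-completion work (a sketch of the standard definitions, with depth errors in millimeters and inverse-depth errors in 1/km; this is my reading of the common convention, not the repo's exact evaluation code):

```python
import numpy as np

def depth_metrics(pred_m, gt_m):
    """MAE/RMSE in mm, iMAE/iRMSE in 1/km, over valid ground-truth pixels."""
    mask = gt_m > 0  # evaluate only where ground truth is valid
    err_mm = 1000.0 * (pred_m[mask] - gt_m[mask])
    mae = np.mean(np.abs(err_mm))
    rmse = np.sqrt(np.mean(err_mm ** 2))
    # inverse depth in 1/km: 1000 / depth_in_meters
    inv_err = 1000.0 / pred_m[mask] - 1000.0 / gt_m[mask]
    imae = np.mean(np.abs(inv_err))
    irmse = np.sqrt(np.mean(inv_err ** 2))
    return mae, rmse, imae, irmse
```

Under these definitions, identical predictions and ground truth give all four metrics as exactly zero, so any nonzero gap between runs traces back to the inputs or the checkpoint.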
My environment is as follows:
torch 1.3.0
torchvision 0.4.1
Python 3.7.12
CUDA 10.2
Thank you in advance.
Hi, sorry, that was an old run script that survived the code cleanup. Can you try using
https://github.com/alexklwong/calibrated-backprojection-network/blob/master/bash/void/run_kbnet_nyu_v2.sh
instead?
I will delete
https://github.com/alexklwong/calibrated-backprojection-network/blob/master/bash/void/run_knet_nyu_v2_test.sh
Hi, thanks for your quick reply. I used https://github.com/alexklwong/calibrated-backprojection-network/blob/master/bash/void/run_kbnet_nyu_v2.sh instead.
However, the results seem to be the same.
(kbnet) zlq@ivg-SYS-7048GR-TR:/home/disk2/code/calibrated-backprojection-network$ bash bash/void/run_kbnet_nyu_v2.sh
Input paths:
testing/nyu_v2/nyu_v2_test_image_corner.txt
testing/nyu_v2/nyu_v2_test_sparse_depth_corner.txt
testing/nyu_v2/nyu_v2_test_intrinsics_corner.txt
testing/nyu_v2/nyu_v2_test_ground_truth_corner.txt
Input settings: input_channels_image=3 input_channels_depth=2 normalized_image_range=[0.0, 1.0] outlier_removal_kernel_size=7 outlier_removal_threshold=1.50
Sparse to dense pooling settings: min_pool_sizes_sparse_to_dense_pool=[15, 17] max_pool_sizes_sparse_to_dense_pool=[23, 27, 29] n_convolution_sparse_to_dense_pool=3 n_filter_sparse_to_dense_pool=8
Depth network settings: n_filters_encoder_image=[48, 96, 192, 384, 384] n_filters_encoder_depth=[16, 32, 64, 128, 128] resolutions_backprojection=[0, 1, 2, 3] n_filters_decoder=[256, 128, 128, 64, 12] deconv_type=up min_predict_depth=0.10 max_predict_depth=8.00
Weight settings: n_parameter=6957764 n_parameter_depth=6957764 n_parameter_pose=0 weight_initializer=xavier_normal activation_func=leaky_relu
Evaluation settings: min_evaluate_depth=0.20 max_evaluate_depth=5.00
Checkpoint settings: checkpoint_path=pretrained_models/void/evaluation_results/nyu_v2
depth_model_restore_path=pretrained_models/void/kbnet-void1500.pth
Hardware settings: device=cuda n_thread=1
Evaluation results:
MAE RMSE iMAE iRMSE
122.836 228.426 24.147 50.003
+/- +/- +/- +/-
71.550 130.133 16.920 36.531
Total time: 9457.15 ms Average time per sample: 14.46 ms
Saving outputs to pretrained_models/void/evaluation_results/nyu_v2/outputs
Sorry, I'm in Tel Aviv for ECCV this week, so replies may be slow. Let me set up the repo and try.
Yeah, have a good week!
I'm trying to follow your work and build on it for our own model. Besides VOID and KITTI, I noticed that you also reported KBNet's performance on NYUv2. Could you share the checkpoint trained on NYUv2 when you have time?
KBNet is currently the SOTA method in this field, so it will undoubtedly be our main baseline. I would really appreciate it if you could share the NYUv2 pre-trained model.
This is quite strange. I set up the repo from scratch and ran the void-1500 checkpoint on the VOID1500 and NYUv2 test sets:
For VOID1500
Evaluation results:
MAE RMSE iMAE iRMSE
39.803 95.864 21.161 49.723
+/- +/- +/- +/-
27.521 67.776 24.340 62.204
Total time: 10159.95 ms Average time per sample: 12.70 ms
For NYUv2
Evaluation results:
MAE RMSE iMAE iRMSE
119.942 223.310 23.569 48.905
+/- +/- +/- +/-
69.832 128.695 16.550 35.989
Total time: 7262.26 ms Average time per sample: 11.10 ms
which is different from what you've obtained, and also different from the paper. Since NYUv2 was originally processed in MATLAB, there may be some processing tied to that platform that is causing the difference in the numbers. The differences are small, so this might just be noise introduced during data processing.
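As one illustration of how innocuous preprocessing differences can shift metrics by small amounts: if depth maps are round-tripped through a 16-bit millimeter PNG encoding at some point in the pipeline (an assumption here for illustration, not confirmed for this dataset), the quantization alone adds bounded sub-millimeter noise per pixel:

```python
import numpy as np

rng = np.random.default_rng(0)
depth_m = rng.uniform(0.2, 5.0, size=(480, 640))  # depths in meters

# Round-trip through a uint16 millimeter encoding, a common way
# depth-completion datasets store sparse depth and ground truth.
depth_mm_u16 = np.round(depth_m * 1000.0).astype(np.uint16)
recovered_m = depth_mm_u16.astype(np.float64) / 1000.0

mae_mm = np.mean(np.abs(recovered_m - depth_m)) * 1000.0
print(mae_mm)  # bounded by 0.5 mm per pixel by construction
```

Differences in resizing, cropping, or interpolation between a MATLAB pipeline and a Python one would add noise of a similar (small) order, which is consistent with the sub-millimeter-to-few-millimeter gaps seen between the runs above.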