
How to reproduce the result on DDAD

Open csBob123 opened this issue 4 years ago • 7 comments

Hi, thank you for releasing the code. I am trying to train PackNet on DDAD, but I cannot reproduce the reported results so far. I am using 8 V100 GPUs. The training command is:

```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 horovodrun -np 8 -H localhost:8 python scripts/train.py ./configs/train_ddad.yaml
```

The details of my config are as follows:

```yaml
model:
    name: 'SelfSupModel'
    optimizer:
        name: 'Adam'
        depth:
            lr: 0.00009
        pose:
            lr: 0.00009
    scheduler:
        name: 'StepLR'
        step_size: 30
        gamma: 0.5
    depth_net:
        name: 'PackNet01'
        version: '1A'
    pose_net:
        name: 'PoseNet'
        version: ''
    params:
        crop: ''
        min_depth: 0.0
        max_depth: 200.0
datasets:
    augmentation:
        image_shape: (384, 640)
    train:
        batch_size: 8
        num_workers: 8
        dataset: ['DGP']
        path: ['/data/ddad_train_val/ddad.json']
        split: ['train']
        depth_type: ['lidar']
        cameras: [['camera_01']]
        repeat: [5]
    validation:
        num_workers: 8
        dataset: ['DGP']
        path: ['/data/ddad_train_val/ddad.json']
        split: ['val']
        depth_type: ['lidar']
        cameras: [['camera_01']]
    test:
        num_workers: 8
        dataset: ['DGP']
        path: ['/data/ddad_train_val/ddad.json']
        split: ['val']
        depth_type: ['lidar']
        cameras: [['camera_01']]
checkpoint:
    filepath: './data/experiments'
    monitor: 'abs_rel_pp_gt'
    monitor_index: 0
    mode: 'min'
```
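For context on the numbers in this config: under Horovod data-parallel training, the effective batch size is the per-GPU batch size times the number of workers, and a common (though not universal) heuristic scales the base learning rate linearly with the worker count. A minimal sketch (the helper name is illustrative, not part of packnet-sfm):

```python
# Hypothetical helper: effective batch size and linearly scaled LR
# under data-parallel training. Not packnet-sfm code.
def effective_batch_and_lr(per_gpu_batch, num_gpus, base_lr):
    """Return (effective batch size, linearly scaled learning rate)."""
    return per_gpu_batch * num_gpus, base_lr * num_gpus

# The config above: batch_size 8 on each of 8 V100s, base LR 9e-5.
batch, lr = effective_batch_and_lr(8, 8, 9e-5)
# batch == 64
```

Whether packnet-sfm applies this linear scaling internally is worth checking in the training script; the log below reports the LR as 4.50e-05 after the scheduler has run.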

```
E: 50 BS: 8 - SelfSupModel LR (Adam): Depth 4.50e-05 Pose 4.50e-05

*** /data/ddad_train_val/ddad.json/val (camera_01)

| METRIC      | abs_rel | sqr_rel | rmse   | rmse_log | a1    | a2    | a3    |
| DEPTH       | 0.853   | 23.485  | 37.371 | 2.022    | 0.002 | 0.005 | 0.008 |
| DEPTH_PP    | 0.853   | 23.542  | 37.468 | 2.025    | 0.002 | 0.004 | 0.008 |
| DEPTH_GT    | 0.268   | 12.451  | 19.267 | 0.333    | 0.705 | 0.869 | 0.936 |
| DEPTH_PP_GT | 0.257   | 11.199  | 18.532 | 0.324    | 0.709 | 0.873 | 0.939 |
```
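For reference, the abs_rel column above is the mean absolute relative depth error over valid ground-truth pixels. A minimal NumPy sketch of the standard definition (not the repo's evaluation code):

```python
import numpy as np

def abs_rel(pred, gt):
    """Mean absolute relative depth error over valid (gt > 0) pixels."""
    valid = gt > 0
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

# Toy example: a prediction that is 10% off everywhere.
gt = np.array([10.0, 20.0, 50.0])
pred = gt * 1.1
err = abs_rel(pred, gt)  # ≈ 0.1
```

An abs_rel of 0.853 (versus the ~0.17 reported for DDAD) therefore indicates the self-supervised training has not converged to a useful depth estimate, rather than a small accuracy gap.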

Are there any problems? Thank you for your attention.

csBob123 avatar May 21 '21 10:05 csBob123

Hmm, can you try a few things:

  • Start from a pre-trained model (e.g. a KITTI model) to see if it diverges
  • Try another network (DepthResNet or PoseResNet)
  • Play around with the learning rate

By the way, once you get some numbers you can try submitting to our EvalAI DDAD challenge! https://eval.ai/web/challenges/challenge-page/902/overview
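Swapping in the ResNet-based networks is a small config change. A sketch against the config above; the '18pt' version string (an ImageNet-pretrained ResNet18) follows the pattern used in other configs in the repo, so adjust it if your version differs:

```yaml
# Sketch: replace PackNet01/PoseNet with the ResNet variants suggested above.
model:
    depth_net:
        name: 'DepthResNet'
        version: '18pt'
    pose_net:
        name: 'PoseResNet'
        version: '18pt'
```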

VitorGuizilini-TRI avatar May 21 '21 15:05 VitorGuizilini-TRI


Did you use any pre-trained weights to get the results of 0.173 (abs_rel) on DDAD and 0.111 (abs_rel) on KITTI, or did you just train from scratch?

csBob123 avatar May 21 '21 16:05 csBob123

No, those are trained from scratch with PackNet. I just mentioned pre-trained weights as a way to see if there is anything wrong with the training setup that you are using.

VitorGuizilini-TRI avatar May 26 '21 15:05 VitorGuizilini-TRI

Hi, thanks for your work. Were the results on DDAD produced by training from scratch using the config setup provided here? https://github.com/TRI-ML/packnet-sfm/blob/master/configs/train_ddad.yaml

a1600012888 avatar Jun 01 '21 11:06 a1600012888

@a1600012888 Yes, that configuration file should work.

VitorGuizilini-TRI avatar Jun 01 '21 16:06 VitorGuizilini-TRI


Thanks!

a1600012888 avatar Jun 02 '21 16:06 a1600012888


Hi, for the DDAD experiments, did you train the model using 8 GPU cards with this config file? If so, does that mean the effective batch size is 8*2 = 16, and the learning rate is 9e-5?

a1600012888 avatar Jun 03 '21 14:06 a1600012888