LIGA-Stereo

Questions when running these scripts

Open Xie-PC opened this issue 3 years ago • 5 comments

Thanks for sharing. When I first run either of the following commands in my Docker container, nothing is printed:

`./scripts/dist_train.sh 1 dev configs/stereo/kitti_models/liga.yaml`

or

`./scripts/dist_test_ckpt.sh 1 ./configs/stereo/kitti_models/liga.yaml ./ckpt/pretrained_liga.pth`

If I cancel the process with Ctrl+C and run it again, it shows:

```bash
Traceback (most recent call last):
  File "tools/train.py", line 211, in <module>
    main()
  File "tools/train.py", line 73, in main
    args.tcp_port, args.local_rank, backend='nccl'
  File "/root/LIGA-Stereo-master/liga/utils/common_utils.py", line 181, in init_dist_pytorch
    world_size=num_gpus
  File "/root/miniconda3/envs/liga/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 422, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/root/miniconda3/envs/liga/lib/python3.7/site-packages/torch/distributed/rendezvous.py", line 126, in _tcp_rendezvous_handler
    store = TCPStore(result.hostname, result.port, world_size, start_daemon, timeout)
RuntimeError: Address already in use
Traceback (most recent call last):
  File "/root/miniconda3/envs/liga/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/root/miniconda3/envs/liga/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/miniconda3/envs/liga/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/root/miniconda3/envs/liga/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/root/miniconda3/envs/liga/bin/python', '-u', 'tools/train.py', '--local_rank=0', '--launcher', 'pytorch', '--fix_random_seed', '--sync_bn', '--save_to_file', '--cfg_file', 'configs/stereo/kitti_models/liga.yaml', '--exp_name', 'dev']' returned non-zero exit status 1.
```

How should I solve it?

Xie-PC avatar Nov 21 '21 14:11 Xie-PC

I met the same problem and solved it by changing `liga.yaml` to `liga.3d-and-bev.yaml`: `./scripts/dist_train.sh 1 dev configs/stereo/kitti_models/liga.3d-and-bev.yaml`

WeiSQ-zju avatar Nov 22 '21 08:11 WeiSQ-zju

Thanks for your reply, but that did not work for me. I still have this problem.

Xie-PC avatar Nov 22 '21 08:11 Xie-PC

The Python process from the previous run was not completely killed, so it is still holding the port. Find its PID and kill it (or run `killall python` if this is the only Python program you are running).
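
To confirm that a leftover process is the cause of `Address already in use`, one option is to check whether the rendezvous port can still be bound. The sketch below is only an illustration using the standard `socket` module; `port_in_use` and `find_free_port` are hypothetical helper names, and which port your launch script actually uses is an assumption you need to check in the script itself.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if (host, port) cannot be bound, i.e. something still holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another process is bound to this port.
            return True
        return False


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]
```

If `port_in_use(port)` returns `True` for the port the script uses while no training should be running, a stale process is probably still alive and needs to be killed as described above. `find_free_port()` only reports a port that was free at the moment of the call, so another process could still take it before training starts.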

xy-guo avatar Nov 22 '21 11:11 xy-guo

> I met the same problem, and I solve it by change liga.yaml to liga.3d-and-bev.yaml. ./scripts/dist_train.sh 1 dev configs/stereo/kitti_models/liga.3d-and-bev.yaml

Hi, did you face any problems like "cannot import ** from mmcv.cnn"? I installed mmcv-full and mmdet. I also tried different versions of mmcv and mmdet, but I could not find a combination that runs the test/train scripts. Any suggestions would be helpful. Thanks a lot!

monstre0731 avatar Mar 21 '22 02:03 monstre0731

@QingwuLiu-polymtl Could you show the complete error log? It should work if you have installed mmcv-full rather than mmcv.
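
When posting the log, it may also help to list which of these packages are actually installed in the environment. The following is a minimal sketch, not part of the repository: `installed_versions` is a hypothetical helper, and the default package list is an assumption based on the names mentioned in this thread. On Python 3.7 it relies on the `importlib_metadata` backport being available.

```python
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Backport package for Python 3.7 (assumed to be installed).
    from importlib_metadata import version, PackageNotFoundError


def installed_versions(names=("mmcv", "mmcv-full", "mmdet", "torch")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

If both `mmcv` and `mmcv-full` show a version, the two installs may be conflicting, which is worth mentioning together with the full error log.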

xy-guo avatar Apr 09 '22 06:04 xy-guo