mmtracking
Error when using dist_train/dist_test
Hello! Sorry to bother you again, but I have run into a new problem that confuses me a lot.
It appears that dist_test/dist_train does not work.
When I run dist_test.sh with the command:
bash ./tools/dist_test.sh configs/mot/tracktor/tracktor_faster-rcnn_r50_fpn_4e_mot17-public-half.py 2 --eval track
I got the error below:
TypeError: can't pickle _thread.RLock objects
return Popen(process_obj)
File "/usr/local/miniconda3/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/usr/local/miniconda3/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/usr/local/miniconda3/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/usr/local/miniconda3/lib/python3.6/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects
Traceback (most recent call last):
File "/usr/local/miniconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/miniconda3/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/miniconda3/lib/python3.6/site-packages/torch/distributed/launch.py", line 253, in
But when I test the model with the single-GPU command:
python ./tools/test.py configs/mot/tracktor/tracktor_faster-rcnn_r50_fpn_4e_mot17-public-half.py --eval track
it works successfully. Could you please help me solve the problem? Thanks a lot!
BTW, I use PyTorch 1.3, CUDA 10.0, mmcv 1.2.6, mmdet 2.8.0, and Python 3.6.
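For context on the traceback above: the frames in popen_spawn_posix.py and reduction.py show multiprocessing pickling the process object before starting a child process, and the TypeError means something reachable from that object holds a threading.RLock. The sketch below only reproduces that kind of failure in isolation. The `Holder` class and `try_pickle` helper are hypothetical names for illustration, and which object in the actual test pipeline carries the lock is not known from the report.

```python
import pickle
import threading


class Holder:
    """Hypothetical stand-in for any object that keeps a lock as an attribute."""

    def __init__(self):
        self.lock = threading.RLock()


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # A plain dict pickles fine; an object holding an RLock does not.
    print(try_pickle({"a": 1}))
    print(try_pickle(Holder()))
```

The exact wording of the message differs between Python versions, but it names the RLock in each case. Passing a suspect object to a helper like this can narrow down which one fails to pickle.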
How many GPUs do you have on your machine?
Can you try another Python version, such as 3.7 or 3.8?
I have 2 GPUs on my machine.
@gsygsygsy123 I don't know whether you have fixed this, but we can hardly trace the error with the limited information you provided. I recommend trying PyTorch 1.5+ and, if the error still occurs, posting your environment information and the full error traceback.
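One possible way to gather the environment information requested above is a short script like the following. This is a minimal sketch, not an official tool: the `collect_env` helper and its default package list are assumptions chosen for illustration, and it reports only the Python version, platform, and each package's `__version__`.

```python
import importlib
import platform


def collect_env(packages=("torch", "torchvision", "mmcv", "mmdet", "mmtrack")):
    """Collect Python, platform, and package versions into a dict."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            info[name] = "not installed"
        except Exception as exc:  # a broken install can fail in other ways
            info[name] = "import failed: {}".format(exc)
        else:
            info[name] = getattr(module, "__version__", "unknown")
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print("{}: {}".format(key, value))
```

Pasting this output together with the complete traceback gives maintainers the versions they need to attempt a reproduction.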