mmdetection
How to set up a multi-GPU environment like dist_train.sh?
How can I set up multi-GPU training inside my own program, the way dist_train.sh does with torch.distributed.launch?
Even with the master port and master address set in os.environ, the process is not replicated across local_rank values because the GPUs are not detected.
torch.cuda.device_count() reports 2 GPUs, but the launcher reading os.environ only sees 1.
Runtime environment:
cudnn_benchmark: True
mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
dist_cfg: {'backend': 'nccl'}
seed: None
Distributed launcher: pytorch
Distributed training: True
GPU number: 1
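For context, what dist_train.sh (via torch.distributed.launch / torchrun) does is spawn one worker process per GPU and set the environment variables that torch.distributed.init_process_group(init_method="env://") reads: MASTER_ADDR, MASTER_PORT, WORLD_SIZE, RANK, and LOCAL_RANK. The sketch below mimics that launcher in plain Python so the pattern is visible; NUM_GPUS, the port, and the sequential (rather than parallel) worker launch are simplifying assumptions, and the real worker would call init_process_group instead of printing.

```python
import os
import subprocess
import sys

NUM_GPUS = 2  # assumption: stands in for torch.cuda.device_count()

# Worker code run in each spawned process. A real mmdetection worker would
# call torch.distributed.init_process_group(backend="nccl",
# init_method="env://"), which reads the variables set below from os.environ.
WORKER = r"""
import os
print(f"rank={os.environ['RANK']} local_rank={os.environ['LOCAL_RANK']} "
      f"world_size={os.environ['WORLD_SIZE']}")
"""

def launch():
    """Spawn one worker per GPU with the env:// rendezvous variables set."""
    outputs = []
    for local_rank in range(NUM_GPUS):
        env = dict(os.environ)
        env.update({
            "MASTER_ADDR": "127.0.0.1",    # rendezvous address (single node)
            "MASTER_PORT": "29500",        # assumed free port
            "WORLD_SIZE": str(NUM_GPUS),   # total number of processes
            "RANK": str(local_rank),       # global rank == local rank on 1 node
            "LOCAL_RANK": str(local_rank), # GPU index on this node
        })
        # Real launchers start workers in parallel; sequential is enough
        # to show which variables each process receives.
        result = subprocess.run([sys.executable, "-c", WORKER],
                                env=env, capture_output=True, text=True)
        outputs.append(result.stdout.strip())
    return outputs

if __name__ == "__main__":
    for line in launch():
        print(line)
```

In practice, rather than setting these variables by hand, launching your script with `torchrun --nproc_per_node=2 train.py` (or `python -m torch.distributed.launch`) sets all of them for you; if only 1 GPU shows up despite device_count() reporting 2, it is worth checking that CUDA_VISIBLE_DEVICES is not restricting the launcher's processes.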
Hi @hoya-cho, I have the same problem. Did you manage to solve it?