YOLOX

ncclUnhandledCudaError: Call to CUDA function failed.

Open bo-bobo opened this issue 3 years ago • 3 comments

Traceback (most recent call last):
  File "/home/psdz/anaconda3/envs/yolox/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/psdz/YOLOX/yolox/core/launch.py", line 91, in _distributed_worker
    comm.synchronize()
  File "/home/psdz/YOLOX/yolox/utils/dist.py", line 48, in synchronize
    dist.barrier()
  File "/home/psdz/anaconda3/envs/yolox/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 2524, in barrier
    work = default_pg.barrier(opts=opts)
RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:38, unhandled cuda error, NCCL version 2.7.8

bo-bobo, Jul 27 '21 08:07

https://github.com/Megvii-BaseDetection/YOLOX/issues/147

Issue as above:

I hit the same bug and am working on it. Can you help me out?

oliverwxg, Jul 27 '21 10:07

I hit the same issue while training yolox_nano.

ladyxuxu, Jul 05 '22 02:07

I hit the same issue while training yolox_nano.

ladyxuxu, Jul 05 '22 02:07