PaddleDetection

RT-DETR cannot be trained

Open Waynepoo opened this issue 2 years ago • 1 comment

Search before asking

  • [X] I have searched the question and found no related answer.

Please ask your question

training on single-GPU

export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_coco.yml --eval

The error is as follows:

  File "/usr/local/lib/python3.7/dist-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/wrapped_decorator.py", line 25, in impl
    return wrapped_func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/framework.py", line 434, in impl
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/tensor/creation.py", line 189, in to_tensor
    stop_gradient=stop_gradient)
OSError: (External) CUDA error(719), unspecified launch failure.
  [Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointerand accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases canbe found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work willreturn the same error. To continue using CUDA, the process must be terminated and relaunched.] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:258)

Environment: I tried three setups: installing manually by following the official installation steps, the Docker image paddlecloud/paddledetection:2.4-gpu-cuda11.2-cudnn8-latest, and the Docker image paddlecloud/paddledetection:2.4-gpu-cuda10.2-cudnn7-latest. All three produce the same error.

Waynepoo Jul 03 '23 02:07

Could this be a mismatch with your CUDA version?

lyuwenyu Jul 10 '23 09:07
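
One way to check for the kind of CUDA version mismatch suggested above is to compare the CUDA and cuDNN versions Paddle was built with against the GPU driver installed on the host. The following is only a minimal sketch, not a confirmed fix for this issue. The helper name report_cuda_setup is hypothetical, and the script assumes that nvidia-smi is on the PATH when an NVIDIA driver is present.

```python
import shutil
import subprocess


def report_cuda_setup():
    """Collect version details that help spot a CUDA build/driver mismatch.

    Hypothetical helper for illustration; returns a list of text lines.
    """
    lines = []

    # Versions that the installed Paddle wheel was compiled against.
    try:
        import paddle
    except ImportError:
        lines.append("paddle is not installed in this environment")
    else:
        lines.append(f"paddle version: {paddle.__version__}")
        lines.append(f"compiled with CUDA support: {paddle.is_compiled_with_cuda()}")
        lines.append(f"built with CUDA: {paddle.version.cuda()}")
        lines.append(f"built with cuDNN: {paddle.version.cudnn()}")

    # GPU name and driver version as reported by the host driver.
    if shutil.which("nvidia-smi") is None:
        lines.append("nvidia-smi not found; cannot query the GPU driver")
    else:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            lines.append("GPU, driver: " + result.stdout.strip())
        else:
            lines.append("nvidia-smi failed: " + result.stderr.strip())

    return lines


if __name__ == "__main__":
    print("\n".join(report_cuda_setup()))
```

If the driver is too old for the CUDA version the wheel was built with, installing a Paddle build that matches the driver is the usual remedy. When using Docker, the driver comes from the host rather than the image, which may be why different images can fail in the same way. Paddle also provides paddle.utils.run_check() as a built-in installation self-test.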