When converting a dataset to COCO format, should the label ids start at 0 or 1?
If they start at 1, I get:
opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/nn/modules/conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at /opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv2d(input, weight, bias, self.stride,
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [102,0,0], thread: [123,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [102,0,0], thread: [124,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [68,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [69,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [71,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/D-FINE/train.py", line 109, in <module>
[rank0]: ...
[rank0]: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
E0211 08:09:32.650000 140312401397568 torch/distributed/elastic/multiprocessing/api.py:826] failed (exitcode: -6) local_rank: 0 (pid: 1210) of binary: /opt/conda/envs/detrv3/bin/python
Traceback (most recent call last):
  File "/opt/conda/envs/detrv3/bin/torchrun", line 33, in <module>
...
If they start at 0, training doesn't converge and the detection results are wrong.
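For what it's worth, this is roughly the check I ran on the converted json to see which ids the annotations actually contain (a minimal sketch; the file path and `num_classes` value are placeholders for my setup, not anything D-FINE-specific):

```python
# Quick sanity check on the converted COCO annotation file.
# NOTE: the path and num_classes are placeholders for my own setup.
import json

ann_file = "annotations/instances_train.json"
num_classes = 80  # should match num_classes in the model/criterion config

with open(ann_file) as f:
    coco = json.load(f)

defined_ids = sorted(c["id"] for c in coco["categories"])
used_ids = sorted({a["category_id"] for a in coco["annotations"]})

print("ids defined in 'categories':", defined_ids)
print("ids used by 'annotations': ", used_ids)
# I assume any id outside [0, num_classes) is what trips the
# CUDA "index out of bounds" assert shown above.
print("out-of-range ids:", [i for i in used_ids if not 0 <= i < num_classes])
```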
Same question here, I'm running into this too.
If they start at 1, I get:
opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/nn/modules/conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at /opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv2d(input, weight, bias, self.stride,
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [102,0,0], thread: [123,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [102,0,0], thread: [124,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [68,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [69,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1712608935911/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [71,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/D-FINE/train.py", line 109, in <module>
[rank0]:     main(args)
[rank0]:   File "/home/D-FINE/train.py", line 53, in main
[rank0]:     solver.fit()
[rank0]:   File "/home/D-FINE/src/solver/det_solver.py", line 63, in fit
[rank0]:     train_stats = train_one_epoch(
[rank0]:   File "/home/D-FINE/src/solver/det_engine.py", line 63, in train_one_epoch
[rank0]:     loss_dict = criterion(outputs, targets, **metas)
[rank0]:   File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/home/D-FINE/src/zoo/dfine/dfine_criterion.py", line 238, in forward
[rank0]:     indices = self.matcher(outputs_without_aux, targets)['indices']
[rank0]:   File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/home/D-FINE/src/zoo/dfine/matcher.py", line 102, in forward
[rank0]:     cost_bbox = torch.cdist(out_bbox, tgt_bbox, p=1)
[rank0]:   File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/functional.py", line 1335, in cdist
[rank0]:     return _VF.cdist(x1, x2, p, None)  # type: ignore[attr-defined]
[rank0]: RuntimeError: CUDA error: device-side assert triggered
[rank0]: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[rank0]: For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
[rank0]: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
E0211 08:09:32.650000 140312401397568 torch/distributed/elastic/multiprocessing/api.py:826] failed (exitcode: -6) local_rank: 0 (pid: 1210) of binary: /opt/conda/envs/detrv3/bin/python
Traceback (most recent call last):
  File "/opt/conda/envs/detrv3/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.3.0', 'console_scripts', 'torchrun')())
  File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/distributed/run.py", line 879, in main
    run(args)
  File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/distributed/run.py", line 870, in run
    elastic_launch(
  File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/opt/conda/envs/detrv3/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 263, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
"If they start at 0, training doesn't converge and the detection results are wrong": I ran into this problem as well. Have you solved it?
The labels should start at 0. On my side, after conversion the mAP I get is the same as with YOLO, less than
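What I do in my conversion script is remap whatever ids the source labels use onto a contiguous range starting at 0 before writing the json, roughly like this (just a sketch of the remapping step; the file names are examples):

```python
import json

def remap_to_zero_based(coco: dict) -> dict:
    """Rewrite category ids to a contiguous 0-based range, e.g. {1: 0, 2: 1, ...}."""
    id_map = {old: new for new, old in enumerate(sorted(c["id"] for c in coco["categories"]))}
    for cat in coco["categories"]:
        cat["id"] = id_map[cat["id"]]
    for ann in coco["annotations"]:
        ann["category_id"] = id_map[ann["category_id"]]
    return coco

if __name__ == "__main__":
    with open("instances_train.json") as f:               # example input file
        coco = json.load(f)
    with open("instances_train_0based.json", "w") as f:   # example output file
        json.dump(remap_to_zero_based(coco), f)
```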