RuntimeError: CUDA error: an illegal memory access was encountered
Checklist
- [X] I have searched for similar issues.
- [X] I have tested with the latest development wheel.
- [X] I have checked the release documentation and the latest documentation (for main branch).
My Question
Running the following script:
(o3dml) abhnegi@tghfg~/o3dml/Open3D-ML$ python scripts/run_pipeline.py torch -c ml3d/configs/pointpillars_kitti.yml --split test --dataset.dataset_path "/pfs/rdi/cea/rdicea_vru/01_Datasets/Kitti/" --pipeline ObjectDetection --dataset.use_cache True
gives me this error:
Traceback (most recent call last):
File "/pfs/rdi/cea/home/abhnegi/o3dml/Open3D-ML/scripts/run_pipeline.py", line 261, in TORCH_USE_CUDA_DSA to enable device-side assertions.
Same issue here:
Exception has occurred: RuntimeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
File "/home/lz/Codes/occ_seg_interpolation/.venv/lib/python3.10/site-packages/torch/_ops.py", line 755, in __call__
return self._op(*args, **(kwargs or {}))
File "/home/lz/Codes/occ_seg_interpolation/.venv/lib/python3.10/site-packages/open3d/ml/torch/python/ops.py", line 1210, in voxelize
*_torch.ops.open3d.voxelize(points=points,
File "/home/lz/Codes/occ_seg_interpolation/src/map/voxel_block.py", line 205, in grid_subsample
) = ml3d.ops.voxelize(
File "/home/lz/Codes/occ_seg_interpolation/src/seg_occ.py", line 217, in main
) = grid_subsample(
File "/home/lz/Codes/occ_seg_interpolation/.venv/lib/python3.10/site-packages/viztracer/decorator.py", line 78, in wrapper
ret = func(*args, **kwargs)
File "/home/lz/Codes/occ_seg_interpolation/src/batch_seg_occ.py", line 145, in single_process
seg_occ_main(
File "/home/lz/Codes/occ_seg_interpolation/src/batch_seg_occ.py", line 177, in main
single_process(input_dir, prelabel_input_dir, adrn, vis_diff)
File "/home/lz/Codes/occ_seg_interpolation/src/batch_seg_occ.py", line 189, in <module>
main(
File "/home/lz/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/lz/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
It consistently occurs on the second call to `ml3d.ops.voxelize` (imported via `import open3d.ml.torch as ml3d`), whereas the first call is always fine.
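Since the error message notes that CUDA errors can be reported asynchronously, the failure seen on the second call may actually originate earlier. Below is a minimal sketch of how one might narrow that down, following the `CUDA_LAUNCH_BLOCKING=1` hint from the message. The helper name `run_synchronized` is hypothetical (it is not part of Open3D-ML or PyTorch), and the `try/except` around the import is only there so the snippet also runs on a machine without PyTorch.

```python
import os

# Hint from the error message: make kernel launches blocking.
# Set this before importing torch so it is in place when CUDA initializes.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

try:
    import torch
except ImportError:  # allows the sketch to run without PyTorch installed
    torch = None


def run_synchronized(label, fn, *args, **kwargs):
    """Hypothetical helper: call fn, then synchronize the device.

    Synchronizing right after the call means a pending CUDA error is
    raised here rather than at some later, unrelated API call.
    """
    result = fn(*args, **kwargs)
    if torch is not None and torch.cuda.is_available():
        torch.cuda.synchronize()
    print(f"{label}: ok")
    return result
```

Wrapping each `ml3d.ops.voxelize` call in `run_synchronized` (with the same arguments as in `grid_subsample`) should show whether the first call really completes cleanly or whether its error only surfaces during the second one.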