
RuntimeError: CUDA error: an illegal memory access was encountered

Open seohoiki3215 opened this issue 2 years ago • 23 comments

Hello, I was impressed by your work and tried to reproduce it with the code you provided. However, every run fails with the runtime error mentioned in the title.

Traceback (most recent call last):
  File "train.py", line 213, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint)
  File "train.py", line 87, in training
    loss = (1.0 - opt.lambda_dssim) * Ll1 + opt.lambda_dssim * (1.0 - ssim(image, gt_image))
  File "/home/seohoiki/Research/NeRF/gaussian-splatting/utils/loss_utils.py", line 38, in ssim
    window = window.cuda(img1.get_device())
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Training progress: 0%| | 0/30000 [00:00<?, ?it/s]
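The last lines of the error suggest rerunning with CUDA_LAUNCH_BLOCKING=1, because the line blamed in the traceback (loss_utils.py, line 38) may not be where the faulty kernel was actually launched. Below is a minimal sketch of one way to do that from Python. The helper names `blocking_env` and `rerun_synchronously` are hypothetical and not part of the repository; the sketch only assumes that `train.py` is started from the repository root with the same arguments as before.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    Hypothetical helper for illustration; it does not modify os.environ.
    """
    env = dict(os.environ if base is None else base)
    # CUDA_LAUNCH_BLOCKING=1 makes each kernel launch block until it finishes,
    # so an error is reported at the call that caused it.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_synchronously(script="train.py", extra_args=()):
    """Run the training script in a child process with blocking launches.

    `script` and `extra_args` are assumptions: pass whatever arguments
    were used for the failing run.
    """
    cmd = [sys.executable, script, *extra_args]
    return subprocess.run(cmd, env=blocking_env()).returncode


if __name__ == "__main__":
    # Forward any command-line arguments to train.py unchanged.
    sys.exit(rerun_synchronously(extra_args=sys.argv[1:]))
```

The variable is set in a child process so that it is in place before CUDA is initialized. Setting it in the shell before invoking `python train.py` has the same effect.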

I tried all the fixes suggested in other issues, but none of them worked. My system and settings: RTX 4090, Ubuntu 22.04 LTS, and the exact environment from the provided .yml file.

Strangely, a colleague with an RTX 3090 on Ubuntu 20.04 runs the code without any problem. (Apart from the GPU and the Ubuntu version, all settings are exactly the same, including the CUDA SDK version.)

I hope to find a solution to this problem!

Thank you.

=====================================
Results with cuda-memcheck

========= CUDA-MEMCHECK
========= This tool is deprecated and will be removed in a future release of the CUDA toolkit
========= Please use the compute-sanitizer tool as a drop-in replacement
Optimizing Output folder: ./output/54877260-0 [17/07 19:21:51]
Tensorboard not available: not logging progress [17/07 19:21:51]
Found transforms_train.json file, assuming Blender data set! [17/07 19:21:51]
Reading Training Transforms [17/07 19:21:51]
Reading Test Transforms [17/07 19:21:53]
Loading Training Cameras [17/07 19:21:56]
Loading Test Cameras [17/07 19:21:57]
Number of points at initialisation : 100000 [17/07 19:21:57]
Training progress: 0%| | 0/30000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 213, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint)
  File "train.py", line 87, in training
    loss = (1.0 - opt.lambda_dssim) * Ll1 + opt.lambda_dssim * (1.0 - ssim(image, gt_image))
  File "/home/seohoiki/Research/NeRF/gaussian-splatting/utils/loss_utils.py", line 38, in ssim
    window = window.cuda(img1.get_device())
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Training progress: 0%| | 0/30000 [00:00<?, ?it/s]
========= ERROR SUMMARY: 0 errors

seohoiki3215, Jul 17 '23 10:07