radii > 0 - CUDA error: illegal memory access
Hi!
I'm currently trying your method on a custom scene, but I'm running into a strange and frustrating issue. Training starts fine, but at some point I get a CUDA kernel error that I cannot make sense of.
I strongly suspect the issue is related to the `gaussians.compute_3D_filter` method (https://github.com/autonomousvision/mip-splatting/blob/746a17c9a906be256ed85b8fe18632f5d53e832d/train.py#L164C1-L165C1), but I have not managed to investigate the error any further.
I found a similar issue in the original GS repo (here: https://github.com/graphdeco-inria/gaussian-splatting/issues/41#issuecomment-1752279620) and made the corresponding changes, rebuilding the diff-gaussian-rasterization submodule, but I still get the error.
Here is the stack trace I get.
Do you have any insight into what is happening and why?
Thanks a lot for your time and your work.
Best,
Gaétan.
Hi, not sure what's wrong. Did you find a solution in the end?
I'm running into a similar problem. Have you fixed it?
Have you fixed it?
I fixed the problem by using a new WSL machine with CUDA 11.8.
I hit the same issue and did some troubleshooting, but I still cannot figure out the cause. I am using a WSL Ubuntu 22.04 machine with CUDA 11.8.

I tried to check the value of `radii` in `gaussian_renderer/__init__.py`, but the check itself raised an error:

    File "/home/eksulfur/mip-splatting/gaussian_renderer/__init__.py", line 101, in render
      if torch.isnan(radii).any() or torch.isinf(radii).any():
    RuntimeError: CUDA error: an illegal memory access was encountered

Since `radii` is computed from `means3D`, `means2D`, and several other variables, I tried to check those as well by adding
```python
# torch is already imported at the top of gaussian_renderer/__init__.py
def check_tensor(name, tensor):
    # Report whether the tensor contains any NaN or Inf entries.
    if torch.isnan(tensor).any() or torch.isinf(tensor).any():
        print(f"Warning: {name} contains NaN or Inf values.")
    else:
        print(f"{name} is valid.")

check_tensor("means3D", means3D)
check_tensor("means2D", means2D)
check_tensor("scales", scales)
check_tensor("rotations", rotations)
check_tensor("opacity", opacity)
check_tensor("background", bg_color)
```
before

    # Rasterize visible Gaussians to image, obtain their radii (on screen).
    rendered_image, radii = rasterizer(
        means3D = means3D,
        means2D = means2D,
        shs = shs,
        colors_precomp = colors_precomp,
        opacities = opacity,
        scales = scales,
        rotations = rotations,
        cov3D_precomp = cov3D_precomp)

and found that some of them are invalid:

    (mip-splatting) eksulfur@EkSulfurROG:~/mip-splatting$ CUDA_LAUNCH_BLOCKING=1 python ~/mip-splatting/train.py -s ~/ArcheGS-server/data/cup -m ~/ArcheGS-server/output/cup --iteration 10000 --save_iteration 5000
    Optimizing /home/eksulfur/ArcheGS-server/output/cup
    Output folder: /home/eksulfur/ArcheGS-server/output/cup [16/02 13:07:41]
    Tensorboard not available: not logging progress [16/02 13:07:41]
    Reading camera 861/861 [16/02 13:07:42]
    Loading Training Cameras [16/02 13:07:42]
    Loading Test Cameras [16/02 13:07:48]
    Number of points at initialisation : 35639 [16/02 13:07:48]
    Computing 3D filter [16/02 13:07:48]
    Training progress: 0%| | 0/10000 [00:00<?, ?it/s]means3D is valid. [16/02 13:07:49]
    means2D is valid. [16/02 13:07:49]
    scales is valid. [16/02 13:07:49]
    rotations is valid. [16/02 13:07:49]
    Warning: opacity contains NaN or Inf values. [16/02 13:07:49]
    background is valid. [16/02 13:07:49]
    Warning: means3D contains NaN or Inf values. [16/02 13:07:49]
    means2D is valid. [16/02 13:07:49]
    Warning: scales contains NaN or Inf values. [16/02 13:07:49]
    Warning: rotations contains NaN or Inf values. [16/02 13:07:49]
    Warning: opacity contains NaN or Inf values. [16/02 13:07:49]
    background is valid. [16/02 13:07:49]
    Warning: means3D contains NaN or Inf values. [16/02 13:07:49]
    means2D is valid. [16/02 13:07:49]
    Warning: scales contains NaN or Inf values. [16/02 13:07:49]
    Warning: rotations contains NaN or Inf values. [16/02 13:07:49]
    Warning: opacity contains NaN or Inf values. [16/02 13:07:49]

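The order in which the tensors turn invalid in this log seems informative. Below is a minimal sketch for extracting it, assuming the log was saved as text and contains exactly the two messages printed by `check_tensor`. `first_invalid_pass` and `SAMPLE_LOG` are hypothetical names, and the sample is a shortened excerpt of the output above with the timestamps removed.

```python
import re

# The two message shapes printed by check_tensor.
PATTERN = re.compile(
    r"Warning: (?P<bad>\w+) contains NaN or Inf values\.|(?P<ok>\w+) is valid\."
)

def first_invalid_pass(log_text):
    """Map each tensor name to the 0-based index of its first failing check."""
    seen = {}       # name -> number of checks seen so far for that name
    first_bad = {}  # name -> index of the first check that reported NaN/Inf
    for m in PATTERN.finditer(log_text):
        name = m.group("bad") or m.group("ok")
        idx = seen.get(name, 0)
        seen[name] = idx + 1
        if m.group("bad") and name not in first_bad:
            first_bad[name] = idx
    return first_bad

# Shortened excerpt of the log in this post (first two render calls).
SAMPLE_LOG = """
Training progress: 0%| | 0/10000 [00:00<?, ?it/s]means3D is valid.
means2D is valid.
scales is valid.
rotations is valid.
Warning: opacity contains NaN or Inf values.
background is valid.
Warning: means3D contains NaN or Inf values.
means2D is valid.
Warning: scales contains NaN or Inf values.
Warning: rotations contains NaN or Inf values.
Warning: opacity contains NaN or Inf values.
background is valid.
"""

for name, idx in first_invalid_pass(SAMPLE_LOG).items():
    print(f"{name}: first invalid on check #{idx}")
```

On this excerpt, `opacity` is flagged on the first check (index 0), while `means3D`, `scales`, and `rotations` are flagged from the second check (index 1) onward. That suggests, without proving it, that the opacity passed to the rasterizer is already non-finite before the first optimizer step, and that the other parameters only become corrupted afterwards.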
Here is the data I used: https://drive.google.com/drive/folders/1n9vjjXueqlh6cMpfSUlFZRJYIMoYhIE3?usp=sharing
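In case it helps anyone debugging the same thing, a stricter variant of the check could stop at the first bad tensor and report how many entries are affected. This is only a sketch: `assert_finite` is a hypothetical helper, not part of mip-splatting, and it assumes the same variable names as in `render`. It has to run before the rasterizer call, because once a kernel has hit an illegal memory access, later CUDA calls in the same process fail as well, which is probably why the `radii` check itself raised.

```python
import torch

def assert_finite(**tensors):
    """Raise FloatingPointError for the first tensor containing NaN or Inf."""
    for name, t in tensors.items():
        if t is None:  # optional inputs may be unset
            continue
        bad = ~torch.isfinite(t)
        if bad.ndim == 0:
            bad = bad.reshape(1)
        n_bad = int(bad.sum())
        if n_bad > 0:
            # Indices along dim 0 (one row per Gaussian) with at least one bad value.
            rows = bad.reshape(bad.shape[0], -1).any(dim=1).nonzero().flatten()
            raise FloatingPointError(
                f"{name}: {n_bad} non-finite values, first rows {rows[:5].tolist()}"
            )

# Small CPU demo with made-up data (not taken from the scene above).
demo = torch.ones(4, 3)
demo[2, 1] = float("nan")
try:
    assert_finite(means3D=torch.zeros(4, 3), opacity=demo)
except FloatingPointError as err:
    print(err)
```

In `render`, this would be called as `assert_finite(means3D=means3D, scales=scales, rotations=rotations, opacity=opacity)` right before `rasterizer(...)`, so training stops with the name of the offending tensor instead of crashing inside the CUDA kernel.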