
Thanks for your contribution!

Open limaolin2017 opened this issue 4 months ago • 11 comments

I hit the following error the first time I ran this:

```
Traceback (most recent call last):
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/_backend.py", line 53, in
    from nerfacc import csrc as _C
ImportError: cannot import name 'csrc' from 'nerfacc' (/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1717, in _run_ninja_build
    subprocess.run(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 201, in
    app.run(main)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 140, in main
    rendered_chunks = render_rays(nerf_models,
  File "/gpfs/home/mli/banmo/nnutils/rendering.py", line 132, in render_rays
    ray_indices, t_starts, t_ends = estimator.sampling(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/estimators/occ_grid.py", line 164, in sampling
    intervals, samples, _ = traverse_grids(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/grid.py", line 158, in traverse_grids
    t_mins, t_maxs, hits = ray_aabb_intersect(rays_o, rays_d, aabbs)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/grid.py", line 43, in ray_aabb_intersect
    t_mins, t_maxs, hits = _C.ray_aabb_intersect(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/__init__.py", line 11, in call_cuda
    from ._backend import _C
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/_backend.py", line 61, in
    _C = load(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1124, in load
    return _jit_compile(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1337, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1449, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'nerfacc_cuda':
[1/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu -o grid.cuda.o
FAILED: grid.cuda.o
/home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu -o grid.cuda.o
/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu:4:10: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory
 #include <ATen/cuda/CUDAGeneratorImpl.h>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
[2/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu -o pdf.cuda.o
FAILED: pdf.cuda.o
/home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu -o pdf.cuda.o
/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu:4:10: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory
 #include <ATen/cuda/CUDAGeneratorImpl.h>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
[3/6] g++ -MMD -MF nerfacc.o.d -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/nerfacc.cpp -o nerfacc.o
[4/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/camera.cu -o camera.cuda.o
[5/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/scan.cu -o scan.cuda.o
ninja: build stopped: subcommand failed.
```

Environment: CUDA 11.3, Torch 1.10, nerfacc 0.5.3

limaolin2017 avatar Feb 05 '24 05:02 limaolin2017

I see this error: `fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory`. For me this file lives at:

/home/ruilongli/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/include/ATen/cuda/CUDAGeneratorImpl.h

You might want to check whether torch is installed correctly in this conda env.
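One quick way to run that check is to look for the header under the environment's site-packages. This is a minimal sketch, assuming the standard pip/conda layout for torch shown in the path above; `find_cuda_generator_header` is a hypothetical helper, not part of nerfacc:

```python
import pathlib


def find_cuda_generator_header(site_packages):
    """Look for ATen/cuda/CUDAGeneratorImpl.h under a site-packages dir.

    An empty result suggests the installed torch build does not ship the
    header that nerfacc's JIT-compiled CUDA extension expects.
    """
    root = pathlib.Path(site_packages)
    pattern = "torch/include/ATen/cuda/CUDAGeneratorImpl.h"
    return [str(p) for p in root.glob(pattern)]


if __name__ == "__main__":
    # Check the active environment's site-packages.
    import sysconfig
    hits = find_cuda_generator_header(sysconfig.get_paths()["purelib"])
    print("header found:" if hits else "header missing", hits)
```

If the header is missing, reinstalling torch into the env (or using a prebuilt nerfacc wheel, as suggested below) sidesteps the JIT compilation entirely.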

Alternatively, you could install our prebuilt wheels from here.

liruilong940607 avatar Feb 07 '24 18:02 liruilong940607

Hi, does the latest version of nerfacc support torch 1.10?

limaolin2017 avatar Feb 07 '24 21:02 limaolin2017

Yes

liruilong940607 avatar Feb 07 '24 23:02 liruilong940607

I switched to another conda env.

Environment: CUDA 11.3, Torch 1.11

I encounter this error:

```
Traceback (most recent call last):
  File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 201, in
    app.run(main)
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 140, in main
    rendered_chunks = render_rays(nerf_models,
  File "/gpfs/home/mli/banmo/nnutils/rendering.py", line 136, in render_rays
    ray_indices, t_starts, t_ends = estimator.sampling(
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/estimators/occ_grid.py", line 164, in sampling
    intervals, samples, _ = traverse_grids(
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/grid.py", line 165, in traverse_grids
    intervals, samples, termination_planes = _C.traverse_grids(
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
    return getattr(_C, name)(*args, **kwargs)
RuntimeError: CUDA error: an illegal memory access was encountered
```
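A note on debugging this class of error: CUDA reports illegal memory accesses asynchronously, so the Python frame in the traceback is often not the real fault site. A common first step (general CUDA practice, not specific to nerfacc) is to force synchronous kernel launches before anything initializes CUDA:

```python
import os

# Must be set before torch initializes CUDA. Kernel launches then block,
# so the traceback points at the kernel that actually faulted, at the
# cost of slower execution.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# ...then import torch / run the failing code, e.g. scripts/visualize/nvs.py
```

Equivalently, set it on the command line: `CUDA_LAUNCH_BLOCKING=1 python scripts/visualize/nvs.py`.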

limaolin2017 avatar Feb 13 '24 10:02 limaolin2017

Could you dump the input of this function here and share it so I can take a look?

liruilong940607 avatar Feb 13 '24 16:02 liruilong940607

traverse_grids inputs:

```
{'rays_o': tensor([[-0.2437, -0.0069,  0.0255],
        [-0.2437, -0.0069,  0.0255],
        [-0.2437, -0.0069,  0.0255],
        ...,
        [-0.2437, -0.0069,  0.0255],
        [-0.2437, -0.0069,  0.0255],
        [-0.2437, -0.0069,  0.0255]], device='cuda:0'),
 'rays_d': tensor([[ 0.9839,  0.1560, -0.2869],
        [ 0.9840,  0.1560, -0.2848],
        [ 0.9841,  0.1559, -0.2827],
        ...,
        [ 0.9838,  0.1237, -0.3298],
        [ 0.9839,  0.1236, -0.3277],
        [ 0.9840,  0.1235, -0.3256]], device='cuda:0'),
 'binaries': tensor([[[[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     ...,

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]]]]),
 'aabbs': tensor([[0.0000, 0.0000, 0.0000, 0.3000, 0.3000, 0.3000]], device='cuda:0'),
 'near_planes': tensor([0.2000, 0.2000, 0.2000,  ..., 0.2000, 0.2000, 0.2000], device='cuda:0'),
 'far_planes': tensor([1., 1., 1.,  ..., 1., 1., 1.], device='cuda:0'),
 'step_size': 0.001,
 'cone_angle': 0.0}
```

limaolin2017 avatar Feb 22 '24 11:02 limaolin2017

I can't use these pasted outputs to examine the code. Could you save them into a file (say `.pth` or `.npz`) and upload it here?
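For the `.npz` route, a minimal sketch of dumping the kwargs and verifying the round trip before uploading (the arrays here are placeholders, not the actual inputs; real torch tensors would first be converted with `t.detach().cpu().numpy()`):

```python
import numpy as np

# Placeholder arrays standing in for the real traverse_grids inputs.
inputs = {
    "rays_o": np.zeros((8, 3), dtype=np.float32),
    "rays_d": np.ones((8, 3), dtype=np.float32),
    "near_planes": np.full((8,), 0.2, dtype=np.float32),
    "far_planes": np.ones((8,), dtype=np.float32),
}

# Save all arrays into one archive, keyed by argument name.
np.savez("inputs.npz", **inputs)

# Reload and verify the round trip.
loaded = np.load("inputs.npz")
assert all(np.array_equal(loaded[k], v) for k, v in inputs.items())
```

Scalar arguments like `step_size` and `cone_angle` can be noted in the comment or stored as 0-d arrays in the same archive.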

liruilong940607 avatar Feb 23 '24 23:02 liruilong940607

Thank you for checking!

inputs.pth.zip

limaolin2017 avatar Feb 27 '24 16:02 limaolin2017

I have checked the input arguments for NaN values and verified their shapes and types, and found no problems. Can you suggest how to debug this further?
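For reference, the checks described above can be sketched like this. NumPy arrays stand in for the tensors, `sanity_check_ray_inputs` is a hypothetical helper, and the argument names follow the `traverse_grids` call shown earlier:

```python
import numpy as np


def sanity_check_ray_inputs(rays_o, rays_d, near_planes, far_planes):
    """Basic NaN / shape / range checks on ray-marching inputs."""
    n = rays_o.shape[0]
    assert rays_o.shape == (n, 3) and rays_d.shape == (n, 3), "rays must be (N, 3)"
    assert near_planes.shape == (n,) and far_planes.shape == (n,)
    for name, a in [("rays_o", rays_o), ("rays_d", rays_d),
                    ("near_planes", near_planes), ("far_planes", far_planes)]:
        assert np.isfinite(a).all(), f"{name} contains NaN/Inf"
    assert (near_planes <= far_planes).all(), "near must not exceed far"


# Values mirror the dump above: identical origins, near 0.2, far 1.0.
sanity_check_ray_inputs(
    rays_o=np.tile([-0.2437, -0.0069, 0.0255], (4, 1)),
    rays_d=np.tile([0.9839, 0.1560, -0.2869], (4, 1)),
    near_planes=np.full(4, 0.2),
    far_planes=np.ones(4),
)
```

Passing these checks does not rule out a device mismatch between the tensors and the occupancy grid, which such value-level checks cannot see.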

limaolin2017 avatar Feb 27 '24 23:02 limaolin2017

I have also checked GPU memory usage; it is normal.

limaolin2017 avatar Mar 04 '24 09:03 limaolin2017

Hi, I will check this issue after the ECCV deadline tomorrow!

liruilong940607 avatar Mar 07 '24 08:03 liruilong940607

Having the same issue with the sampling function; it returns:

```
  File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/estimators/occ_grid.py", line 164, in sampling
    intervals, samples, _ = traverse_grids(
  File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/grid.py", line 165, in traverse_grids
    intervals, samples, termination_planes = _C.traverse_grids(
  File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
    return getattr(_C, name)(*args, **kwargs)
RuntimeError: CUDA error: an illegal memory access was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```

Is there any update on the issue? Thanks

ZitongLan avatar Apr 16 '24 02:04 ZitongLan

I have solved the issue. I hadn't moved my estimator to the GPU, which caused the illegal memory access.
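For anyone hitting the same symptom: moving the estimator with `estimator.to("cuda:0")` before calling `estimator.sampling(...)` should keep its grid buffers on the same device as `rays_o`/`rays_d`. A cheap guard that fails early with a readable message, instead of a late illegal access inside the CUDA extension, can be sketched like this (`assert_same_device` is a hypothetical helper; it works with anything exposing a `.device` attribute, such as torch tensors and buffers):

```python
def assert_same_device(**tensors):
    """Raise early if the named inputs live on different devices.

    Calling this before handing tensors to a CUDA extension turns a
    CPU/GPU mix into a clear Python error instead of an asynchronous
    'illegal memory access' deep inside the kernel.
    """
    devices = {name: str(t.device) for name, t in tensors.items()}
    if len(set(devices.values())) > 1:
        raise ValueError(f"inputs on mixed devices: {devices}")


# Usage sketch (names follow the traverse_grids call in this thread):
# assert_same_device(rays_o=rays_o, rays_d=rays_d,
#                    binaries=estimator.binaries, aabbs=estimator.aabbs)
```

Notably, in the input dump earlier in this thread, `rays_o`/`rays_d`/`aabbs` print with `device='cuda:0'` while `binaries` does not, which is consistent with this diagnosis.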

ZitongLan avatar Apr 16 '24 19:04 ZitongLan