torch_efficient_distloss
RuntimeError: Error building extension 'segment_cumsum_cuda'
Hi! I'm trying to run Block-NeRF and I ran into this error:
Using /root/.cache/torch_extensions/py310_cu116 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/py310_cu116/segment_cumsum_cuda/build.ninja...
Building extension module segment_cumsum_cuda...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ segment_cumsum.o segment_cumsum_kernel.cuda.o -shared -L/opt/conda/lib/python3.10/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda_cu -ltorch_cuda_cpp -ltorch -ltorch_python -L/opt/conda/lib64 -lcudart -o segment_cumsum_cuda.so
FAILED: segment_cumsum_cuda.so
c++ segment_cumsum.o segment_cumsum_kernel.cuda.o -shared -L/opt/conda/lib/python3.10/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda_cu -ltorch_cuda_cpp -ltorch -ltorch_python -L/opt/conda/lib64 -lcudart -o segment_cumsum_cuda.so
/usr/bin/ld: cannot find -lcudart
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
0%| | 0/100000 [00:04<?, ?it/s]
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "/opt/conda/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/docker_block_nerf/Block_NeRF/run_FourierGrid.py", line 115, in <module>
run_train(args, cfg, data_dict, export_cam=True, export_geometry=True)
File "/home/docker_block_nerf/Block_NeRF/FourierGrid/run_train.py", line 382, in run_train
psnr = scene_rep_reconstruction(
File "/home/docker_block_nerf/Block_NeRF/FourierGrid/run_train.py", line 274, in scene_rep_reconstruction
loss_distortion = flatten_eff_distloss(w, s, 1/n_max, ray_id)
File "/opt/conda/lib/python3.10/site-packages/torch_efficient_distloss/eff_distloss.py", line 93, in forward
segment_cumsum_cuda = load(
File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'segment_cumsum_cuda'
(base) root@user:/home/docker_block_nerf/Block_NeRF#
That is, RuntimeError: Error building extension 'segment_cumsum_cuda'. How can I resolve it?
Hi, I also had a similar issue; here is how I fixed it:
- Start with a clean Conda env (I guess a plain Python env would also work; it wouldn't hurt to try).
- First, install the CUDA runtime libraries you will need. Nvidia provides the links here; if you use a Python env you could try the pip version. The important thing here, I think, is to pick the CUDA version your Torch build uses; for me that was CUDA 11.7, so I used the following:
conda install cuda -c nvidia/label/cuda-11.7.0
- Install a compatible Torch build; for me:
pip install torch==2.0.0+cu117 torchvision==0.15.1+cu117 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu117
- Install torch_efficient_distloss:
pip install torch_efficient_distloss
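To make the "match the CUDA version your Torch uses" step less error-prone, here is a minimal sketch that derives the CUDA version from a wheel version string such as `2.0.0+cu117`. The helper names `cuda_version_from_torch` and `conda_cuda_label` are hypothetical, and the label format simply mirrors the `nvidia/label/cuda-11.7.0` channel used above, assuming a `.0` patch release; check that the label actually exists for your version before relying on it.

```python
import re


def cuda_version_from_torch(version: str):
    """Extract the CUDA version from a wheel version such as '2.0.0+cu117'.

    Returns e.g. '11.7', or None for builds without a '+cuNNN' local tag.
    """
    match = re.search(r"\+cu(\d+)$", version)
    if match is None:
        return None
    digits = match.group(1)
    # Assumption: the last digit is the minor version ('cu117' -> 11.7).
    return f"{digits[:-1]}.{digits[-1]}"


def conda_cuda_label(cuda_version: str) -> str:
    """Build a channel label like the one above, assuming a '.0' patch release."""
    return f"nvidia/label/cuda-{cuda_version}.0"


if __name__ == "__main__":
    version = "2.0.0+cu117"  # in a real environment, use torch.__version__
    cuda = cuda_version_from_torch(version)
    print(cuda, conda_cuda_label(cuda))
```

In an environment where Torch is already installed, `torch.version.cuda` should report the same CUDA version directly.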
Hope this helps.
Cheers,
I still have this problem after following these instructions. Did you add the CUDA 11.7 installed via conda to the library path? Or is there something else I can try?
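One thing that may be worth checking (a diagnostic sketch, not a confirmed fix): the failing link command in the log passes `-L/opt/conda/lib64 -lcudart`, so `ld` needs a `libcudart.so` or `libcudart.a` in one of the `-L` directories or in its default search paths. Conda packages commonly install libraries under `<prefix>/lib` rather than `<prefix>/lib64`, which could explain the error, but that is an assumption about your environment. The helpers `link_dirs` and `dirs_with_library` below are hypothetical names; they only list the `-L` directories from a link command and report which of them contain the library.

```python
import os
import shlex


def link_dirs(link_command: str):
    """Return the directories passed via -L in a linker command line."""
    return [
        token[2:]
        for token in shlex.split(link_command)
        if token.startswith("-L") and len(token) > 2
    ]


def dirs_with_library(dirs, name="cudart"):
    """Return the directories that contain lib<name>.so or lib<name>.a."""
    hits = []
    for directory in dirs:
        for suffix in (".so", ".a"):
            if os.path.isfile(os.path.join(directory, f"lib{name}{suffix}")):
                hits.append(directory)
                break
    return hits


if __name__ == "__main__":
    # Link command copied from the build log in the original report.
    command = (
        "c++ segment_cumsum.o segment_cumsum_kernel.cuda.o -shared "
        "-L/opt/conda/lib/python3.10/site-packages/torch/lib -lc10 -lc10_cuda "
        "-ltorch_cpu -ltorch_cuda_cu -ltorch_cuda_cpp -ltorch -ltorch_python "
        "-L/opt/conda/lib64 -lcudart -o segment_cumsum_cuda.so"
    )
    dirs = link_dirs(command)
    print("searched:", dirs)
    print("containing libcudart:", dirs_with_library(dirs))
```

If the second list comes back empty, the library is not where the build looks for it. As far as I know, `LD_LIBRARY_PATH` only affects loading at run time, not the `-l` lookup at link time, so adding the conda directory there alone may not help.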