DirectVoxGO
Building CUDA Extension
Hello authors,
Thank you for your great work and your code. I am trying to run the model, which involves building the CUDA extension. I am aware that issue #13 exists, but it does not provide information that solves my issue, so I am opening a new issue.
I am using CUDA 11.6 with PyTorch built for CUDA 11.6. I have successfully installed pytorch-scatter and all the dependencies in requirements.txt. When I run python run.py --config configs/nerf/lego.py --render_test, I get:
>>> python run.py --config configs/nerf/lego.py --render_test
Using /tmp/torch-ext as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /tmp/torch-ext/adam_upd_cuda/build.ninja...
Building extension module adam_upd_cuda...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] /usr/local/cuda/bin/nvcc -ccbin /software/compilers/gcc-5.4.0/bin/gcc -DTORCH_EXTENSION_NAME=adam_upd_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -isystem /users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include -isystem /users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/TH -isystem /users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /users/user/miniconda3/envs/new2/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /users/user/projects/experiments/active/DirectVoxGO/lib/cuda/adam_upd_kernel.cu -o adam_upd_kernel.cuda.o
FAILED: adam_upd_kernel.cuda.o
/usr/local/cuda/bin/nvcc -ccbin /software/compilers/gcc-5.4.0/bin/gcc -DTORCH_EXTENSION_NAME=adam_upd_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -isystem /users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include -isystem /users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/TH -isystem /users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /users/user/miniconda3/envs/new2/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /users/user/projects/experiments/active/DirectVoxGO/lib/cuda/adam_upd_kernel.cu -o adam_upd_kernel.cuda.o
/users/user/projects/experiments/active/DirectVoxGO/lib/cuda/adam_upd_kernel.cu: In lambda function:
/users/user/projects/experiments/active/DirectVoxGO/lib/cuda/adam_upd_kernel.cu:74:116: warning: ‘T* at::Tensor::data() const [with T = double]’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/ATen/core/TensorBody.h:236:1: note: declared here
T * data() const {
^
/users/user/projects/experiments/active/DirectVoxGO/lib/cuda/adam_upd_kernel.cu:74:141: warning: ‘T* at::Tensor::data() const [with T = double]’ is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/ATen/core/TensorBody.h:236:1: note: declared here
T * data() const {
...
^
/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h: In instantiation of ‘std::shared_ptr<torch::nn::Module> torch::nn::Cloneable<Derived>::clone(const c10::optional<c10::Device>&) const [with Derived = torch::nn::CrossMapLRN2dImpl]’:
/tmp/tmpxft_0000447a_00000000-6_adam_upd_kernel.cudafe1.stub.c:59:27: required from here
/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:58:59: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, at::Tensor>’ to type ‘torch::OrderedDict<std::basic_string<char>, at::Tensor>&’
/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:71:61: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >’ to type ‘torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >&’
/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h: In instantiation of ‘std::shared_ptr<torch::nn::Module> torch::nn::Cloneable<Derived>::clone(const c10::optional<c10::Device>&) const [with Derived = torch::nn::EmbeddingBagImpl]’:
...
/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:58:59: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, at::Tensor>’ to type ‘torch::OrderedDict<std::basic_string<char>, at::Tensor>&’
/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:71:61: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >’ to type ‘torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >&’
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1865, in _run_ninja_build
subprocess.run(
File "/users/user/miniconda3/envs/new2/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/users/user/projects/experiments/active/DirectVoxGO/run.py", line 13, in <module>
from lib import utils, dvgo, dcvgo, dmpigo
File "/users/user/projects/experiments/active/DirectVoxGO/lib/utils.py", line 11, in <module>
from .masked_adam import MaskedAdam
File "/users/user/projects/experiments/active/DirectVoxGO/lib/masked_adam.py", line 7, in <module>
adam_upd_cuda = load(
File "/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1257, in load
return _jit_compile(
File "/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1480, in _jit_compile
_write_ninja_file_and_build_library(
File "/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1594, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/users/user/miniconda3/envs/new2/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1881, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'adam_upd_cuda'
Your assistance on this issue would be greatly appreciated! Thank you again.
Hello, I just wanted to follow up on this and give an update. I have tried with CUDA 11.1, with matching versions of CUDA, PyTorch, and pytorch-scatter. I still get the same error.
Hmm, strange. My machine is on torch==1.10.1+cu111 too and it works well.
Could you please provide more detail about your nvcc version and which version of DVGO you are using?
Besides, have you made any modifications to the C++/CUDA code? It's strange that the backend functions use OrderedDict to access torch::Tensor.
Thanks for the response!
I am using the main branch of DVGO (I just cloned it, installed the dependencies, and tried to run run.py). I have not modified the code at all. Here is the result of nvcc -V:
>>> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
I'm using Python 3.9 and an NVIDIA A40. What GPU/Python/versions are you using?
Thanks so much for your help!
On my side, my nvcc -V is
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
I'm using Python 3.9, torch==1.10.1+cu111.
My GPU is RTX 2080Ti with CUDA 11.1.
I found a similar issue in other repos. It seems that the gcc version matters too. My gcc --version is
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
Thanks for the investigation! I'm using gcc 5.4.0, like in the issue that you linked. I will install a higher version of gcc and try again!
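In case it helps others while I test: a sketch of pointing the JIT build at a different host compiler without touching the system default. The gcc-8 paths below are assumptions for my cluster; as far as I know, torch.utils.cpp_extension honors the CC/CXX environment variables when compiling, and TORCH_EXTENSIONS_DIR relocates the build cache so objects compiled with the old gcc are not silently reused.

```python
import os

# Assumption: gcc/g++ 8 are installed at these paths; adjust for your system.
os.environ["CC"] = "/usr/bin/gcc-8"    # host compiler for nvcc (-ccbin)
os.environ["CXX"] = "/usr/bin/g++-8"   # C++ compiler for the extension build

# Use a fresh cache directory so stale objects built with the old
# compiler are not picked up again.
os.environ["TORCH_EXTENSIONS_DIR"] = "/tmp/torch-ext-gcc8"
```

These have to take effect before lib/masked_adam.py triggers the load() call, so either export them in the shell or set them at the very top of run.py.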
Hi everyone, I am having the same issue and I haven't figured out why. Any help?
I installed pytorch through this command:
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
gcc version on my computer:
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 Copyright (C) 2017 Free Software Foundation, Inc.
g++ version on my computer:
g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 Copyright (C) 2017 Free Software Foundation, Inc.
and nvcc version is
nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2022 NVIDIA Corporation Built on Tue_Mar__8_18:36:24_Pacific_Standard_Time_2022 Cuda compilation tools, release 11.6, V11.6.124 Build cuda_11.6.r11.6/compiler.31057947_0
Any ideas on how to solve this? I am on Ubuntu 18.04 with 16 GB of memory and an RTX Titan, using Python 3.7.5.
Any help is appreciated !
P.S.: I have tried with CUDA versions 11.7, 11.6, and 11.3, and they all give me the same errors.
Thanks in advance!
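Before swapping toolkits again, it may be worth confirming that the nvcc on your PATH and the CUDA build of PyTorch actually agree; a mismatch produces similarly opaque extension-build failures. A small diagnostic sketch (the helper names are mine):

```python
import re
import subprocess

def nvcc_release():
    """Return the 'release X.Y' version reported by `nvcc -V`,
    or None if nvcc is not on PATH."""
    try:
        out = subprocess.run(["nvcc", "-V"], capture_output=True,
                             text=True).stdout
    except FileNotFoundError:
        return None
    m = re.search(r"release (\d+\.\d+)", out)
    return m.group(1) if m else None

def torch_cuda_release():
    """Return the CUDA version PyTorch was built against, or None if
    torch is missing or is a CPU-only build."""
    try:
        import torch
        return torch.version.cuda  # e.g. "11.6"; None for CPU builds
    except ImportError:
        return None

print("nvcc reports:", nvcc_release())
print("torch built for:", torch_cuda_release())
```

If the two strings differ (e.g. nvcc says 11.3 while torch was built for 11.6), fix that before debugging the compiler.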
Has anyone received a message like #error -- unsupported GNU version! gcc versions later than 8 are not supported!?
I have gcc-12.1.0 installed; is this too new? I'm not entirely sure how to install a gcc version like 7.5.0, as it does not seem to be available on any conda channels.
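For reference, a rough sketch of the host-compiler ceiling per toolkit; the numbers are my distillation of NVIDIA's Linux installation guides, so double-check the guide for your exact release before relying on them:

```python
# Approximate newest host gcc major version officially supported by each
# CUDA toolkit release (assumption: taken from NVIDIA's Linux installation
# guides; verify for your exact toolkit version).
MAX_GCC_FOR_CUDA = {
    "10.2": 8,
    "11.0": 9,
    "11.1": 10,
    "11.4": 11,
    "11.6": 11,
    "11.7": 11,
    "12.1": 12,
}

def host_gcc_ok(cuda_release: str, gcc_major: int) -> bool:
    """Return True if this gcc major version is within the supported
    ceiling for the given CUDA release (False if the release is unknown)."""
    ceiling = MAX_GCC_FOR_CUDA.get(cuda_release)
    return ceiling is not None and gcc_major <= ceiling

# gcc 12 is past the ceiling for every 11.x toolkit in the table, which
# matches the "#error -- unsupported GNU version!" message above.
print(host_gcc_ok("11.6", 12))  # → False
print(host_gcc_ok("11.1", 7))   # → True
```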
Did it work? It seems not.
> Thanks for the investigation! I'm using gcc 5.4.0, like in the issue that you linked. I will install a higher version of gcc and try again!
Did you finally solve it? I had the same problem as you.