Getting Error: Maybe a CUDA Version + GCC Version Conflict?

Open nonlin opened this issue 1 year ago • 5 comments

What CUDA version and GCC version should I be using?

Testing on the latest Ubuntu with CUDA 11.5 and GCC 11.3.

Getting the following.

`Traceback (most recent call last):
  File "/home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
    subprocess.run(
  File "/home/g/anaconda3/envs/e4s/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "scripts/face_swap.py", line 16, in <module>
    from src.pretrained.gpen.gpen_demo import init_gpen_pretrained_model, GPEN_demo
  File "/media/g/PCXtend/1TB/AI/e4s/src/pretrained/gpen/gpen_demo.py", line 15, in <module>
    from src.pretrained.gpen.face_enhancement import FaceEnhancement
  File "/media/g/PCXtend/1TB/AI/e4s/src/pretrained/gpen/face_enhancement.py", line 11, in <module>
    from src.pretrained.gpen.face_model.face_gan import FaceGAN
  File "/media/g/PCXtend/1TB/AI/e4s/src/pretrained/gpen/face_model/face_gan.py", line 14, in <module>
    from src.pretrained.gpen.face_model.gpen_model import FullGenerator, FullGenerator_SR
  File "/media/g/PCXtend/1TB/AI/e4s/src/pretrained/gpen/face_model/gpen_model.py", line 16, in <module>
    from src.pretrained.gpen.face_model.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d
  File "/media/g/PCXtend/1TB/AI/e4s/src/pretrained/gpen/face_model/op/__init__.py", line 1, in <module>
    from .fused_act import FusedLeakyReLU, fused_leaky_relu
  File "/media/g/PCXtend/1TB/AI/e4s/src/pretrained/gpen/face_model/op/fused_act.py", line 13, in <module>
    fused = load(
  File "/home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1537, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused': [1/2] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/include -isystem /home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/include/TH -isystem /home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/include/THC -isystem /home/g/anaconda3/envs/e4s/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /media/g/PCXtend/1TB/AI/e4s/src/pretrained/gpen/face_model/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/include -isystem /home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/include/TH -isystem /home/g/anaconda3/envs/e4s/lib/python3.8/site-packages/torch/include/THC -isystem /home/g/anaconda3/envs/e4s/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /media/g/PCXtend/1TB/AI/e4s/src/pretrained/gpen/face_model/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
  435 | function(_Functor&& __f)
      | ^
/usr/include/c++/11/bits/std_function.h:435:145: note: ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
  530 | operator=(_Functor&& __f)
      | ^
/usr/include/c++/11/bits/std_function.h:530:146: note: ‘_ArgTypes’
ninja: build stopped: subcommand failed.`

nonlin avatar Apr 30 '23 15:04 nonlin

Seems like a ninja-related error. My environment is CUDA 11.3 + GCC 7.5.0.

e4s2022 avatar May 03 '23 10:05 e4s2022

Did you find a solution, @nonlin?

aylinSyntonym avatar May 18 '23 11:05 aylinSyntonym

Getting the exact same error.

ketakachono avatar May 21 '23 04:05 ketakachono

I'm doing a fresh install of Ubuntu 20.04, since you mentioned you tested with it, but getting CUDA 11.3 and GCC 7.5.0 installed together is becoming a pain. Can you give me the list of commands you used, in the correct order?

Inferencer avatar May 25 '23 22:05 Inferencer

This is a known bug with CUDA 11.5 and GCC 11.3. If you downgrade to GCC 10, it may work.

Install gcc-10 and g++-10:

`sudo apt install g++-10 gcc-10`

Then switch to v10:

`sudo ls -la /usr/bin/ | grep -oP "[\S]*(gcc|g\+\+)(-[a-z]+)*[\s]" | xargs bash -c 'for link in ${@:1}; do sudo ln -s -f "/usr/bin/${link}-${0}" "/usr/bin/${link}"; done' 10`
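For anyone unsure what the switching one-liner does, here is a rough Python dry run of the same idea. It is only a sketch: `plan_links` and `TOOL_NAME` are hypothetical names, it takes a plain list of file names instead of reading `/usr/bin`, and unlike the shell command it skips tools that have no versioned counterpart. It only prints the `ln` commands and changes nothing on disk.

```python
import re

# Mirrors the grep pattern: names ending in gcc or g++, optionally
# followed by lowercase suffixes such as -ar or -ranlib.
TOOL_NAME = re.compile(r"\S*(gcc|g\+\+)(-[a-z]+)*")


def plan_links(names, version):
    """Return (target, link) name pairs that the one-liner would relink.

    names   -- file names found in the bin directory (hypothetical input)
    version -- GCC major version to switch to, e.g. 10
    """
    available = set(names)
    plan = []
    for name in names:
        if not TOOL_NAME.fullmatch(name):
            continue  # not an unversioned gcc/g++ tool name
        target = f"{name}-{version}"
        # Assumption: only relink when the versioned binary exists.
        if target in available:
            plan.append((target, name))
    return plan


if __name__ == "__main__":
    listing = ["gcc", "gcc-10", "gcc-11", "g++", "g++-10", "gcc-ar", "gcc-ar-10", "ls"]
    for target, link in plan_links(listing, 10):
        print(f"ln -s -f /usr/bin/{target} /usr/bin/{link}")
```

Reviewing the planned links first makes it easier to see which system symlinks would be overwritten before running anything with `sudo`.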

However, installing CUDA Toolkit 12.2 from NVIDIA should solve the problem.
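Before changing anything, it may help to confirm which pair of versions is actually installed. The sketch below is an assumption-laden helper, not part of e4s or PyTorch: the function names are made up, and the only "bad" combination it knows about is the GCC 11 + CUDA 11.5 pair reported in this thread. It reads `gcc -dumpversion` and the `release X.Y` text from `nvcc --version`.

```python
import re
import shutil
import subprocess


def parse_gcc_major(dumpversion_output):
    """Major version from `gcc -dumpversion` output such as '11' or '11.3.0'."""
    match = re.match(r"\s*(\d+)", dumpversion_output)
    return int(match.group(1)) if match else None


def parse_nvcc_release(version_output):
    """(major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def matches_reported_failure(gcc_major, cuda_release):
    """True only for the combination reported as failing in this thread."""
    return gcc_major == 11 and cuda_release == (11, 5)


def run(cmd):
    """Return stdout of cmd, or None when the tool is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True).stdout


if __name__ == "__main__":
    gcc_out = run(["gcc", "-dumpversion"])
    nvcc_out = run(["nvcc", "--version"])
    gcc_major = parse_gcc_major(gcc_out) if gcc_out else None
    cuda_release = parse_nvcc_release(nvcc_out) if nvcc_out else None
    print("GCC major:", gcc_major, "| CUDA release:", cuda_release)
    if matches_reported_failure(gcc_major, cuda_release):
        print("This is the GCC 11 + CUDA 11.5 pair reported as failing above.")
```

Running it again after switching compilers or upgrading the toolkit shows whether the change actually took effect on `PATH`.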

colt18 avatar Jul 01 '23 19:07 colt18