stylegan2-pytorch

Errors regarding compilation of FusedLeakyRelu cuda kernels

Open ParthaEth opened this issue 5 years ago • 6 comments

Hi, I get the following errors while trying to use the fused activation. Does anyone have an idea why?

Traceback (most recent call last):
  File "/is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 960, in _build_extension_module
    check=True)
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 17, in <module>
    from model import StyledGenerator, Discriminator, TextureSpaceDiscriminator
  File "/is/cluster/work/pghosh/gif1.0/model.py", line 19, in <module>
    from my_utils.stylegan2_model import StyledConv
  File "/is/cluster/work/pghosh/gif1.0/my_utils/stylegan2_model.py", line 11, in <module>
    from my_utils.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d
  File "/is/cluster/work/pghosh/gif1.0/my_utils/op/__init__.py", line 1, in <module>
    from .fused_act import FusedLeakyReLU, fused_leaky_relu
  File "/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_act.py", line 14, in <module>
    os.path.join(module_path, 'fused_bias_act_kernel.cu'),
  File "/is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 658, in load
    is_python_module)
  File "/is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 827, in _jit_compile
    with_cuda=with_cuda)
  File "/is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 880, in _write_ninja_file_and_build
    _build_extension_module(name, build_directory, verbose)
  File "/is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 973, in _build_extension_module
    raise RuntimeError(message)
RuntimeError: Error building extension 'fused': [1/2] /is/software/nvidia/cuda-10.0/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/include -isystem /is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/include/TH -isystem /is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/include/THC -isystem /is/software/nvidia/cuda-10.0/include -isystem /is/ps2/pghosh/.virtualenvs/gif/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -std=c++11 -c /is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
/is/software/nvidia/cuda-10.0/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/include -isystem /is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/include/TH -isystem /is/ps2/pghosh/.virtualenvs/gif/lib/python3.6/site-packages/torch/include/THC -isystem /is/software/nvidia/cuda-10.0/include -isystem /is/ps2/pghosh/.virtualenvs/gif/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -std=c++11 -c /is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: a pointer to a bound function may only be used to call the function

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: type name is not allowed

/is/cluster/work/pghosh/gif1.0/my_utils/op/fused_bias_act_kernel.cu(79): error: expected an expression

36 errors detected in the compilation of "/tmp/tmpxft_00004c5b_00000000-6_fused_bias_act_kernel.cpp1.ii".
ninja: build stopped: subcommand failed.

ParthaEth avatar Feb 23 '20 15:02 ParthaEth

You may need to update your gcc.

rosinality avatar Feb 23 '20 23:02 rosinality

I am running into this issue, too. My platform is PyTorch 1.1, CUDA 9.0, gcc 7.4.0.

AtlantixJJ avatar Feb 28 '20 01:02 AtlantixJJ

@AtlantixJJ Could you try a recent PyTorch (I have tested with >=1.3) and CUDA 10?
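To compare a setup against the versions mentioned here (PyTorch >= 1.3, CUDA 10), something like the following sketch may help. The helper names `parse_version`, `nvcc_release`, and `report` are made up for illustration. It assumes `nvcc` and `gcc` are on `PATH` and that `nvcc --version` prints a line containing `release X.Y`. It only reports versions and does not guarantee that the extension will build.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    return tuple(int(part) for part in match.group(1).split(".")) if match else None


def nvcc_release(output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def run_tool(command):
    """Run a command and return its combined output, or '' if the tool is missing."""
    if shutil.which(command[0]) is None:
        return ""
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.stdout


def report():
    """Collect the versions discussed in this thread; missing tools show up as None."""
    try:
        import torch  # optional: the report still works without PyTorch installed
        torch_version = parse_version(torch.__version__)
        torch_cuda = torch.version.cuda  # CUDA version PyTorch was built against
    except ImportError:
        torch_version, torch_cuda = None, None
    return {
        "torch": torch_version,
        "torch_cuda": torch_cuda,
        "nvcc": nvcc_release(run_tool(["nvcc", "--version"])),
        "gcc": parse_version(run_tool(["gcc", "--version"])),
    }


if __name__ == "__main__":
    info = report()
    print(info)
    if info["torch"] is not None and info["torch"] < (1, 3):
        print("PyTorch is older than 1.3, the oldest version reported as tested here")
    if info["nvcc"] is not None and info["nvcc"] < (10, 0):
        print("nvcc is older than CUDA 10")
```

Note that the `nvcc` found on `PATH` can differ from the CUDA version PyTorch was built against, which is why both are reported.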

rosinality avatar Feb 28 '20 02:02 rosinality

I found the following while using Windows:

  1. Your CUDA version must be >= 10.1; otherwise it will not compile.
  2. To compile on Windows, you must install the Visual Studio compiler and run from the Visual Studio developer command line. I am using the cmd prompt with the x64 native tools.
  3. If a build failed once and still gives compilation errors on later attempts, I found no way to clear the compiler cache other than reinstalling PyTorch.

When using CUDA >= 10.1, make sure that your code is compatible with PyTorch > 1.2. On Linux this should be easier, since you are not tied to a particular command-line environment beyond the Python environment.
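The first two points can be checked before attempting a build. This is a rough sketch, not a complete diagnosis: `windows_build_problems` is a hypothetical helper, the CUDA version is passed in by hand as a `(major, minor)` tuple, and the Visual Studio check just looks for `cl` on `PATH`, which is an assumption about what the x64 native tools prompt sets up.

```python
import shutil
import sys


def windows_build_problems(cuda_version, which=shutil.which, platform=sys.platform):
    """Return a list of likely reasons the JIT build would fail, per the checklist above.

    cuda_version is a (major, minor) tuple such as (10, 1).
    `which` and `platform` are injectable so the check can be tried on any machine.
    """
    problems = []
    if cuda_version < (10, 1):
        problems.append("CUDA %d.%d is older than 10.1" % cuda_version)
    # On Windows the MSVC compiler driver is cl.exe; it is normally only on PATH
    # inside a Visual Studio developer / x64 native tools command prompt.
    if platform.startswith("win") and which("cl") is None:
        problems.append("cl.exe not found on PATH; start from the x64 native tools prompt")
    return problems


if __name__ == "__main__":
    for problem in windows_build_problems((10, 0)):
        print(problem)
```

An empty list only means these two prerequisites look fine; the build can still fail for other reasons.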

walbermr avatar Apr 16 '20 15:04 walbermr

@walbermr Hmm, I think you can instead remove the compiler outputs in the torch_extensions temp directories.
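A minimal sketch of that cleanup, under stated assumptions: the cache lives either in the directory named by the `TORCH_EXTENSIONS_DIR` environment variable, or in a `torch_extensions` folder under the system temp directory or under `~/.cache` (the default has differed between PyTorch versions and platforms, so check where yours actually is). The extension name `fused` is taken from the error message above; the function names are made up for illustration.

```python
import os
import shutil
import tempfile


def candidate_extension_dirs(env=os.environ):
    """Directories where JIT-built extensions may be cached (assumed locations)."""
    if env.get("TORCH_EXTENSIONS_DIR"):
        return [env["TORCH_EXTENSIONS_DIR"]]
    return [
        os.path.join(tempfile.gettempdir(), "torch_extensions"),
        os.path.join(os.path.expanduser("~"), ".cache", "torch_extensions"),
    ]


def remove_cached_build(name, roots):
    """Delete every cached build directory called `name` under the given roots."""
    removed = []
    for root in roots:
        if not os.path.isdir(root):
            continue
        for current, dirs, _files in os.walk(root):
            if name in dirs:
                target = os.path.join(current, name)
                shutil.rmtree(target)
                dirs.remove(name)  # do not descend into the directory just deleted
                removed.append(target)
    return removed


# Example (deletes files, so run it deliberately):
# print(remove_cached_build("fused", candidate_extension_dirs()))
```

The next import of the module then triggers a fresh compile instead of reusing the stale build directory.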

rosinality avatar Apr 16 '20 15:04 rosinality

@AtlantixJJ Could you try a recent PyTorch (I have tested with >=1.3) and CUDA 10?

Excuse me, my GPU is an NVIDIA RTX 3090, and I only use CUDA 11. How can I solve this problem?

pfeducode avatar Aug 16 '22 08:08 pfeducode