stylegan2-pytorch
torch version
Some errors occurred while compiling the code. Could you tell us your torch version and the rest of your software environment, such as CUDA, cuDNN, gcc, ninja, and re2c? Thank you!
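The versions asked about here can be gathered in one go. The sketch below is only an illustration: `tool_version` and `collect_versions` are hypothetical helper names, the command-line tools are assumed to be on `PATH`, and anything that cannot be found is reported as `None`. PyTorch also ships `python -m torch.utils.collect_env`, which prints a fuller report.

```python
import importlib
import shutil
import subprocess


def tool_version(name):
    """Return the raw output of `<name> --version`, or None if unavailable."""
    path = shutil.which(name)
    if path is None:
        return None
    try:
        out = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    # Some tools print their version to stderr instead of stdout.
    return (out.stdout or out.stderr).strip() or None


def collect_versions():
    """Gather the versions requested in this thread (hypothetical helper)."""
    info = {name: tool_version(name) for name in ("gcc", "nvcc", "ninja", "re2c")}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        info.update(torch=None, cuda=None, cudnn=None)
    else:
        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # CUDA version PyTorch was built with
        info["cudnn"] = torch.backends.cudnn.version()
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Note that `torch.version.cuda` is the CUDA version PyTorch was built against, which can differ from the toolkit that `nvcc` belongs to.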
I have tested it on PyTorch 1.3 + CUDA 10, and it runs successfully.
I used PyTorch 1.3.1 and CUDA 10.2. It seems that the PyTorch version is crucial. (See https://github.com/rosinality/stylegan2-pytorch/issues/1)
@rosinality I installed PyTorch 1.3.1, torchvision 0.4.2, and CUDA 10.1, and got "ImportError: /tmp/torch_extensions/fused/fused.so: undefined symbol: _ZN3c1011CPUTensorIdEv". Your torchvision is 0.4.2, right?
Could you retry after removing the /tmp/torch_extensions directory?
Sorry, I don't know how to remove /tmp/torch_extensions, and I am not familiar with PyTorch C++ extensions. Could you explain more?
I suspect it is still trying to use the cached binaries even after the CUDA update.
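Removing the cached build forces PyTorch to recompile the extension on the next import. A minimal sketch, assuming the cache lives where the error message points (`/tmp/torch_extensions` in this thread; the `TORCH_EXTENSIONS_DIR` environment variable overrides the default location). The helper name `clear_extension_cache` is hypothetical, and the function does the same job as `rm -rf` on that directory.

```python
import os
import shutil
from pathlib import Path


def clear_extension_cache(cache_dir=None):
    """Delete a torch_extensions build cache; return True if something was removed."""
    if cache_dir is None:
        # Assumption: fall back to the path seen in this thread's error message.
        cache_dir = os.environ.get("TORCH_EXTENSIONS_DIR", "/tmp/torch_extensions")
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```

Calling `clear_extension_cache()` and then rerunning `train.py` should trigger a fresh build of `fused.so`.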
Now I have updated CUDA to 10.2 and added CUDA to my .bashrc file, but the same error occurred. Do you have any suggestions? Should I reboot the machine?
I don't think you need to reboot after CUDA updates. Could you post full error logs?
Traceback (most recent call last):
File "train.py", line 20, in
How about your gcc version? Mine is gcc 5.4, and I am hesitating to update to gcc 7.3.
I'm using gcc 5.4
Did you try removing the cached binaries in /tmp/torch_extensions? After that, could you show me the output of
> ldd /tmp/torch_extensions/fused/fused.so
ldd /tmp/torch_extensions/fused/fused.so
    linux-vdso.so.1 => (0x00007ffdeb198000)
    libcudart.so.10.0 => /usr/local/lib/libcudart.so.10.0 (0x00007f24bc54d000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f24bc1cb000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f24bbfb5000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f24bbbeb000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f24bca7c000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f24bb9e7000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f24bb7ca000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f24bb5c2000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f24bb2b9000)
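One thing this listing can be checked for is which CUDA runtime the cached `fused.so` was linked against. The sketch below (the helper `linked_cudart_version` is hypothetical) pulls the version suffix out of the `libcudart.so.X.Y` entry in `ldd` output. In the listing above that suffix is 10.0, while the toolkits reported in this thread were 10.1 and 10.2, which would be consistent with a stale cached build, though that is only an inference from the output.

```python
import re


def linked_cudart_version(ldd_output):
    """Return the X.Y suffix of the libcudart entry in ldd output, or None."""
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", ldd_output)
    return match.group(1) if match else None


# Two lines copied from the ldd output posted in this thread.
sample = (
    "libcudart.so.10.0 => /usr/local/lib/libcudart.so.10.0 (0x00007f24bc54d000)\n"
    "libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f24bc1cb000)\n"
)
print(linked_cudart_version(sample))  # 10.0
```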
It seems there are cases where PyTorch cannot resolve the CUDA shared libraries. (https://github.com/NVIDIAGameWorks/kaolin/issues/30) But I don't know how you can resolve it. If you use Anaconda, maybe you can create a new virtual env, install PyTorch 1.3 and cudatoolkit 10.1 in it, and try again.
You are right: after 'rm -rf /tmp/torch_extensions', the error disappeared. Thank you so much. So the case where PyTorch couldn't resolve the CUDA shared libraries can be ignored.
------------------ Original message ------------------ From: "Kim Seonghyeon"; Sent: Thursday, December 26, 2019, 10:55 PM; To: "rosinality/stylegan2-pytorch"; Cc: "晴子", "Author"; Subject: Re: [rosinality/stylegan2-pytorch] torch version (#5)
I have the same problem, but I was unable to solve it by removing /tmp/torch_extensions. Did you do anything else to solve this problem? @qingzi02010

No, I used the recommended version of torch. Once I ran 'rm -rf /tmp/torch_extensions', the "ImportError: /tmp/torch_extensions/fused/fused.so: undefined symbol: _ZN3c1011CPUTensorIdEv" disappeared.
I am using Anaconda's Python 3.7. I don't know whether the problem is related to the Python version, but you can try it.
------------------ Original message ------------------ From: "wosecz"; Sent: Monday, January 6, 2020, 2:51 PM; To: "rosinality/stylegan2-pytorch"; Cc: "晴子", "Mention"; Subject: Re: [rosinality/stylegan2-pytorch] torch version (#5)
Yes, this method is correct. I tried it several times and it fixed the problem. (But I got another problem...) Thank you for your reply!
https://www.cnblogs.com/rainsoul/p/12162779.html I do not know what your problem is, but you can refer to this page and try its method.
Does anyone know which TensorFlow version to use? Neither TF 1.14 nor 1.15 (see the original stylegan2 repo) is compatible with CUDA 10.2.
@kevinstan I use TF 1.15 on CUDA 10.2. It seems to run on it.
Something weird happens when I try to train it from inside a screen session:
ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /tmp/torch_extensions/fused/fused.so)
I tried removing /tmp/torch_extensions, but no luck!
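This error means the `libstdc++.so.6` picked up at run time does not provide a symbol version that the compiled extension requires, so clearing the cache may not help unless the build and run environments use matching libraries. A rough way to see which `GLIBCXX` versions a given library provides is to scan it for version strings, much like `strings libstdc++.so.6 | grep GLIBCXX`. The sketch below is an illustration only; `glibcxx_versions` and `provides` are hypothetical helpers.

```python
import re


def glibcxx_versions(data):
    """Return sorted GLIBCXX version tuples found in raw library bytes."""
    found = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", data))
    return sorted(tuple(int(p) for p in v.decode().split(".")) for v in found)


def provides(data, wanted):
    """Check whether the library bytes mention e.g. 'GLIBCXX_3.4.21'."""
    version = tuple(int(p) for p in wanted.split("_", 1)[1].split("."))
    return version in glibcxx_versions(data)


# Usage, with the path taken from the error message in this thread:
#     with open("/lib64/libstdc++.so.6", "rb") as fh:
#         print(provides(fh.read(), "GLIBCXX_3.4.21"))
```

If the check comes back `False` for the library named in the error, the extension was probably built against a newer libstdc++ than the one being loaded, for example because the session started from screen does not have the same environment activated.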
I face the same issue. Was this resolved, and if so, how? Could you please share the .yml file of your conda env?
@Harsha-Musunuri could you resolve this issue?
I face the same problem: TensorFlow 1.14 is not compatible with CUDA 10.2. Also, PyTorch 1.3 is not compatible with gcc > 5 and CUDA 10.2, but the convert_weight.py code requires gcc > 5 and CUDA 10.2.
Do you have a conda env .yml file that is compatible with all the required library versions?
@rosinality PyTorch 1.3 is not compatible with CUDA 10.2. Did you install it locally and build PyTorch from source?
@denabazazian I don't remember the environment well. You can use a recent version of PyTorch.
@denabazazian try this https://drive.google.com/file/d/1EaYl5IP0gBqjagX9mZfXr88l13eUzKay/view?usp=sharing
I tried the conda env file to no avail. I'm using CUDA 10.1 with PyTorch 1.7.1. I failed to downgrade this to 1.3.1. I tried other PyTorch versions but ran into other problems which, once resolved, led back to this state:
CalledProcessError                        Traceback (most recent call last)
~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in _run_ninja_build(build_directory, verbose, error_prefix)
   1532             stdout_fileno = 1
-> 1533             subprocess.run(
   1534                 command,

~/miniconda3/envs/dG/lib/python3.9/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    527         if check and retcode:
--> 528             raise CalledProcessError(retcode, process.args,
    529                                      output=stdout, stderr=stderr)
CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
~/Documents/dG/alias-free-gan-pytorch/train.py in
~/Documents/dG/alias-free-gan-pytorch/stylegan2/op/__init__.py in
~/Documents/dG/alias-free-gan-pytorch/stylegan2/op/fused_act.py in
~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, keep_intermediates)
    984         verbose=True)
    985     '''
--> 986     return _jit_compile(
    987         name,
    988         [sources] if isinstance(sources, str) else sources,

~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, keep_intermediates)
   1191                     clean_ctx=clean_ctx
   1192                 )
-> 1193                 _write_ninja_file_and_build_library(
   1194                     name=name,
   1195                     sources=sources,

~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in _write_ninja_file_and_build_library(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda)
   1295         if verbose:
   1296             print('Building extension module {}...'.format(name))
-> 1297         _run_ninja_build(
   1298             build_directory,
   1299             verbose,

~/miniconda3/envs/dG/lib/python3.9/site-packages/torch/utils/cpp_extension.py in _run_ninja_build(build_directory, verbose, error_prefix)
   1553         if hasattr(error, 'output') and error.output:  # type: ignore
   1554             message += ": {}".format(error.output.decode())  # type: ignore
-> 1555         raise RuntimeError(message) from e
   1556
   1557
RuntimeError: Error building extension 'fused': [1/2] /usr/local/cuda-10.1/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/TH -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/mr/miniconda3/envs/dG/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -std=c++14 -c /home/mr/Documents/dG/alias-free-gan-pytorch/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
/usr/local/cuda-10.1/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/TH -isystem /home/mr/miniconda3/envs/dG/lib/python3.9/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/mr/miniconda3/envs/dG/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -std=c++14 -c /home/mr/Documents/dG/alias-free-gan-pytorch/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
In file included from /usr/local/cuda-10.1/include/cuda_runtime.h:83,
from
@MHRosenberg It is not a PyTorch version problem but a problem with the CUDA build environment. You can check whether you can build CUDA programs at all, or use https://github.com/rosinality/alias-free-gan-pytorch/blob/main/Dockerfile.
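One way to check the build environment independently of PyTorch is to compile a trivial kernel with the same `nvcc`. A minimal sketch, assuming `nvcc` is on `PATH`; `can_build_cuda` is a hypothetical helper that returns `None` when no compiler is found, and otherwise whether compilation succeeded, together with the compiler's messages.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# A kernel that does nothing; it only exercises nvcc and the host compiler.
KERNEL = """
__global__ void noop() {}
int main() { noop<<<1, 1>>>(); return 0; }
"""


def can_build_cuda(nvcc="nvcc"):
    """Try to compile a trivial kernel; return (ok, log), with ok=None if nvcc is missing."""
    nvcc_path = shutil.which(nvcc)
    if nvcc_path is None:
        return None, f"{nvcc} not found on PATH"
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "noop.cu"
        src.write_text(KERNEL)
        proc = subprocess.run(
            [nvcc_path, "-c", str(src), "-o", str(Path(tmp) / "noop.o")],
            capture_output=True,
            text=True,
        )
    return proc.returncode == 0, proc.stderr.strip()
```

If this already fails with header errors similar to the log above, the problem likely lies in the pairing of host compiler and CUDA toolkit rather than in the repository code.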
Hi, I was working on the SAM code and I am getting an import error: ImportError: /root/.cache/torch_extensions/fused/fused.so: cannot open shared object file: No such file or directory. The error appears after running from models.psp import pSp. I am running on Deepnote. Could you please help me with this error?