extension-cpp
Compiler error in /cuda/setup.py
Hello,
compiling setup.py in /cpp succeeds, but for /cuda/setup.py I get the compile error below. Do you have an idea what my mistake could be?
Best regards
System:
- OS: Ubuntu 18.04.1 LTS
- PyTorch version: 1.0
- How you installed PyTorch (conda, pip, source): conda
- Python version: 3.6.8
- CUDA/cuDNN version: 10.0
- GPU models and configuration: GeForce GTX 1080 Ti
- GCC version (if compiling from source): 7.3.0
Error log:
running install
running bdist_egg
running egg_info
writing lltm_cuda.egg-info/PKG-INFO
writing dependency_links to lltm_cuda.egg-info/dependency_links.txt
writing top-level names to lltm_cuda.egg-info/top_level.txt
reading manifest file 'lltm_cuda.egg-info/SOURCES.txt'
writing manifest file 'lltm_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'lltm_cuda' extension
gcc -pthread -B /pizady/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include -I/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include/TH -I/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/pizady/anaconda3/include/python3.6m -c lltm_cuda.cpp -o build/temp.linux-x86_64-3.6/lltm_cuda.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=lltm_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from lltm_cuda.cpp:1:0:
/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:7:2: warning: #warning "Including torch/torch.h for C++ extensions is deprecated. Please include torch/extension.h" [-Wcpp]
#warning \
^~~~~~~
/usr/local/cuda/bin/nvcc -I/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include -I/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include/TH -I/pizady/anaconda3/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/pizady/anaconda3/include/python3.6m -c lltm_cuda_kernel.cu -o build/temp.linux-x86_64-3.6/lltm_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=lltm_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
lltm_cuda_kernel.cu(54): error: calling a __host__ function("std::fmax<double, float> ") from a __global__ function("_NV_ANON_NAMESPACE::lltm_cuda_forward_kernel<float> ") is not allowed
lltm_cuda_kernel.cu(54): error: identifier "std::fmax<double, float> " is undefined in device code
2 errors detected in the compilation of "/tmp/tmpxft_00000f0c_00000000-6_lltm_cuda_kernel.cpp1.ii".
I don't think you made any mistake.
So, for the warning: please include torch/extension.h instead of torch/torch.h.
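For anyone following along, a minimal sketch of that change, assuming your lltm_cuda.cpp still includes the deprecated header:

// lltm_cuda.cpp
// #include <torch/torch.h>   // deprecated for extensions, triggers the -Wcpp warning
#include <torch/extension.h>  // one-stop header for C++/CUDA extensions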
For the error, this has been asked a few times: https://github.com/pytorch/extension-cpp/issues?utf8=%E2%9C%93&q=is%3Aissue+fmax
I think the consensus was that this is an environment error, and the best solution is to build PyTorch from source.
No, it is caused by the CUDA API; it has no relevance to PyTorch. Just cast the second argument to (double). That's the best solution.
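To illustrate, a sketch of that cast, assuming the failing line 54 is an ELU-style expression like the one in the tutorial kernel (the exact code may differ):

// lltm_cuda_kernel.cu (sketch)
// fmax(0.0, z) with a float z resolves to std::fmax<double, float>, a
// host-only <cmath> template; casting the second argument to double makes
// both arguments double, so the __device__ overload fmax(double, double)
// is selected instead.
template <typename scalar_t>
__device__ __forceinline__ scalar_t elu(scalar_t z, scalar_t alpha = 1.0) {
  return fmax(0.0, (double) z) + fmin(0.0, (double) (alpha * (exp(z) - 1.0)));
}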
Got the same error here.
- OS: Ubuntu 16.04
- CUDA version: 10.0
- PyTorch version: 1.1.0a0+7e73783 (built from source)
- Python version: 3.7
That said, the solution from #21 seems to work. Discussion in #15 also hints that casting to scalar_t might actually be the thing to do, since the numbers are otherwise implicitly cast to double (sketched below).
Normally I would just add the (scalar_t) cast and move on, but I wanted to submit a PR (see #31) and cannot build on a clean workspace.
Any hints on what to do? I actually could build before (last summer), but since then I have updated my Python version, along with CUDA (and of course PyTorch). I might try a Docker build to get a perfectly clean install, but if the problem is common enough, maybe we should add this cast on fmax (and fmin); casting everything to scalar_t is better than casting everything to double.
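For reference, a sketch of the scalar_t cast mentioned above, again assuming the ELU helper from the tutorial (a hypothetical reconstruction, not the actual PR):

// lltm_cuda_kernel.cu (sketch)
// Casting the literals to scalar_t keeps every operand in the tensor's own
// precision: float tensors stay float32, double tensors keep full float64.
template <typename scalar_t>
__device__ __forceinline__ scalar_t elu(scalar_t z, scalar_t alpha = 1.0) {
  return fmax(static_cast<scalar_t>(0.0), z)
       + fmin(static_cast<scalar_t>(0.0), alpha * (exp(z) - static_cast<scalar_t>(1.0)));
}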
After some investigation, it seems related to the GCC version. I originally tested with gcc-7 and it didn't work; switching to gcc-5 with a simple update-alternatives made it work. PyTorch itself was compiled from source with gcc-7.
Any idea what might have changed from gcc-5 to gcc-7?
I reproduced this on Docker today and fixed the issue with this commit: https://github.com/pytorch/extension-cpp/commit/1031028f3b048fdea84372f3b81498db53d64d98
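For context, the gist of that change (a paraphrased sketch, not the verbatim commit): switch to the single-precision device intrinsics so overload resolution can never reach the host-only std:: template:

// lltm_cuda_kernel.cu (sketch of the fminf/fmaxf approach)
// fmaxf/fminf are __device__ float intrinsics, so std::fmax<double, float>
// is never selected. Note that this squeezes every operand through float32,
// which matters for float64 tensors (see the next comment).
template <typename scalar_t>
__device__ __forceinline__ scalar_t elu(scalar_t z, scalar_t alpha = 1.0) {
  return fmaxf(0.0, z) + fminf(0.0, alpha * (exp(z) - 1.0));
}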
Hi, thanks for the commit! Unfortunately, I believe fminf and fmaxf implicitly cast everything to float32. As a consequence, check.py and grad_check.py are now broken with CUDA, because the precision is not sufficient for float64 tensors.
Example output:
python check.py forward -c
Forward: Baseline (Python) vs. C++ ... Ok
Forward: Baseline (Python) vs. CUDA ... Traceback (most recent call last):
File "check.py", line 104, in <module>
check_forward(variables, options.cuda, options.verbose)
File "check.py", line 45, in check_forward
check_equal(baseline_values, cuda_values, verbose)
File "check.py", line 22, in check_equal
np.testing.assert_allclose(x, y, err_msg="Index: {}".format(i))
File "/home/cpinard/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1452, in assert_allclose
verbose=verbose, header=header, equal_nan=equal_nan)
File "/home/cpinard/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 789, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0
Index: 0
(mismatch 13.333333333333329%)
x: array([-1.206306e-04, 9.878260e-01, -2.557970e-01, 3.771263e-01,
-1.863440e-01, 5.914125e-02, 6.415094e-01, 3.132478e-04,
1.672588e-03, -4.412979e-03, -1.300380e-01, -7.609038e-01,
5.438342e-01, 6.241342e-02, -3.342839e-01])
y: array([-1.206305e-04, 9.878260e-01, -2.557970e-01, 3.771263e-01,
-1.863440e-01, 5.914125e-02, 6.415094e-01, 3.132469e-04,
1.672588e-03, -4.412979e-03, -1.300380e-01, -7.609038e-01,
5.438342e-01, 6.241342e-02, -3.342839e-01])
Whoops, this is my bad. Let me re-set up the environment and see what I can do about this.
@soumith Hi Soumith, did you find a solution for this precision problem? I ran into it in my C++ extension, too.
I also encountered a similar problem. After removing some entries from my PATH variable that I suspected might cause conflicts, I was able to solve it.