GenRe-ShapeHD
error in projection foward: no kernel image is available for execution on the device
I've been stuck on this issue for several days.
While running test_genre.sh, I got the following error:
Traceback (most recent call last):
File "test.py", line 95, in
Does anyone have a solution for this? Thanks.
Thank you for making the code available, Xiuming.
I've met the same error in trying to train marrnet with shapenet examples. Is there a solution here?
Hao, did you ever figure this out?
Thanks again, Jeff
@weeoooweeooo would you mind sharing your detailed error message? It seems that I can not reproduce this. I suspect this might be caused by improper install of cuda kernels; I'll update an install script for this.
@ztzhang Thank you for responding so quickly. I'm in the process of trying to install new kernels exactly.
==> Training
Epoch 1/1000
10000/10000 [==============================] - 188s - loss: 1549.6328 - depth: 614.3428 - silhou: 483.5301 - normal: 451.7600 - depth_minmax: 2138.4353
Eval 1/1000
error in projection foward: no kernel image is available for execution on the device
Traceback (most recent call last):
File "train.py", line 216, in
However, the issues originally arose while I was trying to work around the deprecation of torch.utils.ffi in PyTorch 1.0. I'm using an RTX GPU, which requires PyTorch 1.0 and CUDA 10, but I don't understand _wrap_function or create_extension well enough to rewrite those sections. The original errors follow. It seems the solution isn't a drop-in replacement. Do you have any ideas?
==> Parsing arguments
Traceback (most recent call last):
File "train.py", line 18, in
@weeoooweeooo I think your problem is different from the OP's. @royalhao3zZ I just pushed a fix with clean_toolbox_build.sh and build_toolbox.sh. Would you mind cleaning the previous build first and then rebuilding the toolbox? Thanks!
As for @weeoooweeooo, I think I might have a quick fix for that in hand, please stay tuned.
@weeoooweeooo I figured out a quick fix to make it compile. However we would like to keep the original repo consistent so I do not plan to push this to the repo.
Here's what I did:
- copy all .c files as .cpp files.
- for each setup.sh, comment out lines 34-42.
- modify the build.py as follows (only showing for calc_prob) :
import os
import sys
import torch
from torch.utils.cpp_extension import CppExtension, BuildExtension, include_paths
this_file = os.path.dirname(os.path.realpath(__file__))
print(this_file)
extra_compile_args = list()
extra_objects = list()
assert(torch.cuda.is_available())
sources = ['calc_prob/src/calc_prob.cpp']
headers = ['calc_prob/src/calc_prob.h']
defines = [('WITH_CUDA', True)]
with_cuda = True
extra_objects = ['calc_prob/src/calc_prob_kernel.cu.o']
extra_objects = [os.path.join(this_file, fname) for fname in extra_objects]
ffi_params = {
    #'headers': headers,
    'sources': sources,
    'define_macros': defines,
    #'relative_to': __file__,
    #'with_cuda': with_cuda,
    'extra_objects': extra_objects,
    'include_dirs': [os.path.join(this_file, 'calc_prob/src')] + include_paths(True),
    'extra_compile_args': extra_compile_args,
}
if __name__ == '__main__':
    ext = CppExtension(
        'calc_prob._ext.calc_prob_lib',
        # package=False,
        **ffi_params)
    from setuptools import setup
    setup(name='calc_prob', ext_modules=[ext], cmdclass={'build_ext': BuildExtension})
You can then run setup.sh first to build the .so files, and then run python build.py build_ext to build the extensions you need. After that, you may need to copy or soft-link the built _ext directory from the build folder (it may sit under parent folders named after your OS and Python version) to calc_prob/calc_prob/_ext.
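The last step (linking the built _ext directory into the package) could be scripted roughly as below. This is only a sketch: the helper name link_built_ext, the "**/_ext" search pattern, and the assumption that it is run from toolbox/calc_prob are illustrative and not part of the repo.

```python
import glob
import os


def link_built_ext(build_root, target_dir, pattern="**/_ext"):
    """Symlink the first `_ext` directory found under build_root to target_dir.

    Returns the absolute path that was linked, or None if nothing matched.
    Hypothetical helper for illustration; not part of GenRe-ShapeHD.
    """
    matches = sorted(glob.glob(os.path.join(build_root, pattern), recursive=True))
    matches = [m for m in matches if os.path.isdir(m)]
    if not matches:
        return None
    source = os.path.abspath(matches[0])
    # Make sure the package directory that will hold the link exists.
    os.makedirs(os.path.dirname(os.path.abspath(target_dir)), exist_ok=True)
    if os.path.islink(target_dir):
        os.remove(target_dir)  # replace a stale link left by an earlier build
    os.symlink(source, target_dir)
    return source
```

From toolbox/calc_prob, a call such as link_built_ext("build", os.path.join("calc_prob", "_ext")) would then point the package at the freshly built library. If a real _ext directory (not a link) already exists at the target, os.symlink raises FileExistsError, so clean the old build first.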
@weeoooweeooo would you mind letting us know if this works for you? Thanks!
@ztzhang Thank you so much for helping me with this specific workaround. I have tried your suggestions, but I am now getting this error when calling functions from the newly built extension:
==> Training
Epoch 1/1000
10000/10000 [==============================] - 191s - loss: 1574.3203 - depth: 618.3826 - silhou: 500.1726 - normal: 455.7651 - depth_minmax: 1982.9009
Eval 1/1000
Traceback (most recent call last):
File "train.py", line 213, in
Have tried to troubleshoot a bit. Everything appears smooth, except a warning in compiling:
/srv/git/GenRe-ShapeHD/toolbox/cam_bp
running build_ext
building 'cam_bp._ext.cam_bp_lib' extension
creating build
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/cam_bp
creating build/temp.linux-x86_64-3.6/cam_bp/src
gcc -pthread -B /home/gsq/anaconda3/envs/shaperecon/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA=True -I/srv/git/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src -I/home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/include -I/home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/include/TH -I/home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/include -I/home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/include/TH -I/home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/include/THC -I/home/gsq/anaconda3/envs/shaperecon/include/python3.6m -c cam_bp/src/back_projection.cpp -o build/temp.linux-x86_64-3.6/cam_bp/src/back_projection.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=cam_bp_lib -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/cam_bp
creating build/lib.linux-x86_64-3.6/cam_bp/_ext
g++ -pthread -shared -B /home/gsq/anaconda3/envs/shaperecon/compiler_compat -L/home/gsq/anaconda3/envs/shaperecon/lib -Wl,-rpath=/home/gsq/anaconda3/envs/shaperecon/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/cam_bp/src/back_projection.o /srv/git/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src/back_projection_kernel.cu.o -o build/lib.linux-x86_64-3.6/cam_bp/_ext/cam_bp_lib.cpython-36m-x86_64-linux-gnu.so
Hi, after a more careful read of the docs, it seems the build system now relies on pybind11 to expose the C++ function calls; I'm guessing this is why the error happens. I don't think we need to rewrite everything, since the C API is still maintained; we only need to add pybind11 bindings to the C++ functions. Sorry, I may not have much capacity to fix this particular issue, but I would suggest adding pybind11 bindings to the .cpp files and seeing if it works.
Hi @ztzhang, thank you for your guidance about pybind11 to expose the cpp functions. I have been trying to do so. Here, I have modified their example with your build.py script:
from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext
import sys
import setuptools
import os
import torch
from torch.utils.cpp_extension import CppExtension, BuildExtension, include_paths

version = '0.0.1'
this_file = os.path.dirname(os.path.realpath(__file__))
print(this_file)


class get_pybind_include(object):
    """Helper class to determine the pybind11 include path

    The purpose of this class is to postpone importing pybind11
    until it is actually installed, so that the get_include()
    method can be invoked. """

    def __init__(self, user=False):
        self.user = user

    def __str__(self):
        import pybind11
        return pybind11.get_include(self.user)


extra_compile_args = list()  # ['python3 -m pybind11 --includes']
extra_objects = list()
assert(torch.cuda.is_available())
sources = ['cam_bp/src/back_projection.cpp']
headers = ['cam_bp/src/back_projection.h']
defines = [('WITH_CUDA', True)]
with_cuda = True
extra_objects = ['cam_bp/src/back_projection_kernel.cu.o']
extra_objects = [os.path.join(this_file, fname) for fname in extra_objects]

ffi_params = {
    # 'headers': headers,
    # 'sources': sources,
    'define_macros': defines,
    # 'relative_to': __file__,
    # 'with_cuda': with_cuda,
    'extra_objects': extra_objects,
    'extra_compile_args': extra_compile_args,
}

ext_modules = [
    CppExtension(
        'cam_bp_lib',
        ['cam_bp/src/back_projection.cpp'],
        include_dirs=[
            os.path.join(this_file, 'cam_bp/src'),
            # Path to pybind11 headers
            get_pybind_include(),
            get_pybind_include(user=True),
            '/usr/local/cuda-10.0/targets/x86_64-linux/include'],
        language='c++',
        **ffi_params
    ),
]


# As of Python 3.6, CCompiler has a has_flag method.
# cf http://bugs.python.org/issue26689
def has_flag(compiler, flagname):
    """Return a boolean indicating whether a flag name is supported on
    the specified compiler.
    """
    import tempfile
    with tempfile.NamedTemporaryFile('w', suffix='.cpp') as f:
        f.write('int main (int argc, char **argv) { return 0; }')
        try:
            compiler.compile([f.name], extra_postargs=[flagname])
        except setuptools.distutils.errors.CompileError:
            return False
    return True


def cpp_flag(compiler):
    """Return the -std=c++[11/14/17] compiler flag

    The newer version is preferred over c++11 (when it is available).
    """
    flags = ['-std=c++17', '-std=c++14', '-std=c++11']

    for flag in flags:
        if has_flag(compiler, flag): return flag

    raise RuntimeError('Unsupported compiler -- at least C++11 support '
                       'is needed!')


class BuildExt(build_ext):
    """A custom build extension for adding compiler-specific options."""
    c_opts = {
        'msvc': ['/EHsc'],
        'unix': [],
    }
    l_opts = {
        'msvc': [],
        'unix': [],
    }

    if sys.platform == 'darwin':
        darwin_opts = ['-stdlib=libc++', '-mmacosx-version-min=10.7']
        c_opts['unix'] += darwin_opts
        l_opts['unix'] += darwin_opts

    def build_extensions(self):
        ct = self.compiler.compiler_type
        opts = self.c_opts.get(ct, [])
        link_opts = self.l_opts.get(ct, [])
        if ct == 'unix':
            opts.append('-DVERSION_INFO="%s"' % self.distribution.get_version())
            opts.append(cpp_flag(self.compiler))
            if has_flag(self.compiler, '-fvisibility=hidden'):
                opts.append('-fvisibility=hidden')
        elif ct == 'msvc':
            opts.append('/DVERSION_INFO=\\"%s\\"' % self.distribution.get_version())
        for ext in self.extensions:
            ext.extra_compile_args = opts
            ext.extra_link_args = link_opts
        build_ext.build_extensions(self)


setup(
    name='cam_bp_lib',
    version=version,
    ext_modules=ext_modules,
    install_requires=['pybind11>=2.3'],
    setup_requires=['pybind11>=2.3'],
    cmdclass={'build_ext': BuildExtension},
    zip_safe=False,
)
Though I am able to expose simpler functions with this setup, I'm unable to get it working for your toolboxes so far unfortunately. Currently, I'm getting this error in trying to import the compiled toolbox:
import cam_bp_lib
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/gsq/anaconda3/envs/shaperecon/lib/python3.6/site-packages/cam_bp_lib.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZTIN3c1010TensorImplE
I suspect the main issue has to do with sharing the CUDA library .so to compile with the .cpp file. Do you have any insight about this maybe? Should I try compiling the CUDA code with cuda_extension? Or maybe share the library in this manner https://devtalk.nvidia.com/default/topic/759162/shared-library-separate-compilation-c-c-/ ?
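As an aside on reading that ImportError: the missing symbol name can be decoded by hand. The sketch below (helper names missing_symbol and nested_name are made up for illustration) pulls the symbol out of the loader message and decodes the length-prefixed parts of a simple Itanium C++ ABI nested name. For this symbol it yields c10::TensorImpl, and the _ZTI prefix marks a typeinfo object, so the unresolved symbol belongs to PyTorch's own c10 library rather than to the CUDA kernel object. That would be consistent with the extension being imported before torch, or not being linked against PyTorch's libraries, but this is a reading of the symbol name and not something confirmed in this thread.

```python
import re


def missing_symbol(message):
    """Extract the symbol name from an 'undefined symbol: ...' loader message."""
    match = re.search(r"undefined symbol: (\S+)", message)
    return match.group(1) if match else None


def nested_name(symbol):
    """Decode the parts of a simple Itanium-ABI nested name (N<len><id>...E).

    Deliberately minimal: it handles only plain length-prefixed identifiers,
    which is enough for the symbol in this thread. Not a full demangler.
    """
    start = symbol.find("N")
    if start < 0:
        return None
    parts = []
    pos = start + 1
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])          # decimal length prefix
        parts.append(symbol[end:end + length])  # identifier of that length
        pos = end + length
    return "::".join(parts)
```

Calling nested_name(missing_symbol(str(err))) on the ImportError text above gives the qualified name; a tool such as c++filt does the same job more completely.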
@weeoooweeooo I might have some time to look at this ~~during the weekend~~ next week, my guess is that we also need to add extern_c for the wrapper functions.
I'm not sure if it is still cost efficient to hack it tho; I'll try to overhaul some of those kernels to the current c++ api, and I think some of the ops are already included in the pytorch function sets.
That would be very helpful @ztzhang. I'd appreciate any input you might have. Thanks! I'm afraid I've not had much exposure to c++/cuda, but am very interested in trying to use your model with some medical images. Please let me know what I can do.
I am having the same issue as @royalhao3zZ. I ran ./clean_toolbox_build.sh and then ./build_toolbox.sh again, but I'm still getting the same issue when trying to run scripts/test_genre.sh. If you could provide any insight into this error, or any potential fixes, I would really appreciate it! Thank you!
@dannygelman1 would you mind sharing your compile time messages as well as the error messages?
Yes! Thank you for looking into this!
This is everything that prints after I run scripts/test_genre.sh 0
(The zero is to indicate the index of the gpu I want to use. Since my machine only has one gpu, it is at index 0)
==> Parsing arguments
Namespace(adam_beta1=0.5, adam_beta2=0.9, batch_size=1, classes='chair', dataset=None, epoch=0, epoch_batches=None, eval_at_start=False, eval_batches=None, expr_id=0, full_logdir=None, gpu='0', inpaint_path=None, input_mask='./downloads/data/test/genre/*_silhouette.*', input_rgb='./downloads/data/test/genre/*_rgb.*', joint_train=False, load_offline=False, log_batch=False, log_time=False, logdir=None, lr=0.0001, manual_seed=None, net='genre_full_model', net1_path=None, net_file='./downloads/models/full_model.pt', optim='adam', output_dir='./output/test', overwrite=True, padding_margin=16, pred_depth_minmax=True, resume=0, save_net=1, save_net_opt=False, sgd_dampening=0, sgd_momentum=0.9, suffix='{net}', surface_weight=1.0, tensorboard=False, vis_batches_train=10, vis_batches_vali=10, vis_every_train=1, vis_every_vali=1, vis_param_f=None, vis_workers=4, wdecay=0.0, workers=0)
==> Setting device
[Warning] Designated GPU in use: id=0, util=11%, memory in use: 450 MiB
==> Setting up output directory
==> Setting up loggers
==> Setting up models
[Warning] Model loaded without optimizer states.
Testing GenRe
# model parameters: 100,204,619
==> Setting up data loaders
[Verbose] Time spent in data IO initialization: 0.00s
[Verbose] # test points: 4
[Verbose] # test batches: 4
==> Testing
0%| | 0/4 [00:00<?, ?it/s]error in projection foward: no kernel image is available for execution on the device
Traceback (most recent call last):
File "test.py", line 94, in <module>
model.test_on_batch(i, batch)
File "/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/models/genre_full_model.py", line 182, in test_on_batch
pred = self.forward_with_trimesh(batch)
File "/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/models/genre_full_model.py", line 207, in forward_with_trimesh
proj = self.net.depth_and_inpaint.proj_depth(pred_abs_depth)
File "/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/modules/camera_backprojection_module.py", line 22, in forward
df = CameraBackProjection.apply(depth_t, fl, cam_dist, self.res)
File "/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/functions/cam_back_projection.py", line 25, in forward
cam_bp_lib.back_projection_forward(depth_t, cam_dist, fl, tdf, cnt)
File "/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 202, in safe_call
result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: aborting at /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src/back_projection.c:14
Would you mind cleaning the build and recompiling the CUDA kernels? Please also post the corresponding output so that I can help track this down. Thanks.
After I run ./clean_toolbox_build.sh I get the following
Directory calc_prob/__pycache__ removed
Directory calc_prob/_ext removed
Directory calc_prob/functions/__pycache__ removed
File cam_bp/src/back_projection_kernel.cu.o removed
__pycache__ not found
dist not found
build not found
pytorch_camera_back_projection.egg-info not found
.cache not found
Directory cam_bp/__pycache__ removed
Directory cam_bp/_ext removed
Directory cam_bp/functions/__pycache__ removed
Directory cam_bp/modules/__pycache__ removed
Since it is saying build not found, among other files, does that mean I am not creating all the necessary files?
Yes. Can you post the compile messages as well?
Here are all the messages after I run ./build_toolbox.sh
Add -gencode to match all the GPU architectures you have.
Check 'https://en.wikipedia.org/wiki/CUDA#GPUs_supported' for list of architecture.
Check 'http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html' for GPU compilation based on architecture.
/home/guillermo/anaconda3/envs/shaperecon/bin/python
setup.sh: line 9: /home/guillermo/anaconda3/envs/shaperecon/bin:/home/guillermo/anaconda3/condabin:/home/guillermo/.local/bin:/home/guillermo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda-9.2/bin: No such file or directory
nvcc -c -o calc_prob_kernel.cu.o calc_prob_kernel.cu -x cu -Xcompiler -fPIC -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include/TH -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include/THC -I /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob/src -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_61,code=sm_61
/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob
generating /tmp/tmpx30pui_m/_calc_prob_lib.c
setting the current directory to '/tmp/tmpx30pui_m'
running build_ext
building '_calc_prob_lib' extension
creating home
creating home/guillermo
creating home/guillermo/PycharmProjects
creating home/guillermo/PycharmProjects/Fluid_Research
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob/src
gcc -pthread -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA=True -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob/src -I/home/guillermo/anaconda3/envs/shaperecon/include/python3.6m -c _calc_prob_lib.c -o ./_calc_prob_lib.o -std=c99
gcc -pthread -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA=True -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob/src -I/home/guillermo/anaconda3/envs/shaperecon/include/python3.6m -c /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob/src/calc_prob.c -o ./home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob/src/calc_prob.o -std=c99
gcc -pthread -shared -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -L/home/guillermo/anaconda3/envs/shaperecon/lib -Wl,-rpath=/home/guillermo/anaconda3/envs/shaperecon/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_calc_prob_lib.o ./home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob/src/calc_prob.o /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/calc_prob/calc_prob/src/calc_prob_kernel.cu.o -o ./_calc_prob_lib.so
Add -gencode to match all the GPU architectures you have.
Check 'https://en.wikipedia.org/wiki/CUDA#GPUs_supported' for list of architecture.
Check 'http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html' for GPU compilation based on architecture.
/home/guillermo/anaconda3/envs/shaperecon/bin/python
setup.sh: line 17: /home/guillermo/anaconda3/envs/shaperecon/bin:/home/guillermo/anaconda3/condabin:/home/guillermo/.local/bin:/home/guillermo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda-9.2/bin: No such file or directory
nvcc -c -o nnd_cuda.cu.o nnd_cuda.cu -x cu -Xcompiler -fPIC -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include/TH -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include/THC -I /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_61,code=sm_61
Including CUDA code.
/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance
generating /tmp/tmp4vs_ng9i/_my_lib.c
setting the current directory to '/tmp/tmp4vs_ng9i'
running build_ext
building '_my_lib' extension
creating home
creating home/guillermo
creating home/guillermo/PycharmProjects
creating home/guillermo/PycharmProjects/Fluid_Research
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src
gcc -pthread -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/guillermo/anaconda3/envs/shaperecon/include/python3.6m -c _my_lib.c -o ./_my_lib.o -std=c99
gcc -pthread -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/guillermo/anaconda3/envs/shaperecon/include/python3.6m -c /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src/my_lib.c -o ./home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src/my_lib.o -std=c99
gcc -pthread -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/guillermo/anaconda3/envs/shaperecon/include/python3.6m -c /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src/my_lib_cuda.c -o ./home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src/my_lib_cuda.o -std=c99
gcc -pthread -shared -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -L/home/guillermo/anaconda3/envs/shaperecon/lib -Wl,-rpath=/home/guillermo/anaconda3/envs/shaperecon/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_my_lib.o ./home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src/my_lib.o ./home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src/my_lib_cuda.o /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/nndistance/src/nnd_cuda.cu.o -o ./_my_lib.so
Add -gencode to match all the GPU architectures you have.
Check 'https://en.wikipedia.org/wiki/CUDA#GPUs_supported' for list of architecture.
Check 'http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html' for GPU compilation based on architecture.
/home/guillermo/anaconda3/envs/shaperecon/bin/python
setup.sh: line 9: /home/guillermo/anaconda3/envs/shaperecon/bin:/home/guillermo/anaconda3/condabin:/home/guillermo/.local/bin:/home/guillermo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda-9.2/bin: No such file or directory
nvcc -c -o back_projection_kernel.cu.o back_projection_kernel.cu -x cu -Xcompiler -fPIC -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include/TH -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include/THC -I /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src -I /home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/lib/include -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_61,code=sm_61
/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp
/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp
generating /tmp/tmpymjotthd/_cam_bp_lib.c
setting the current directory to '/tmp/tmpymjotthd'
running build_ext
building '_cam_bp_lib' extension
creating home
creating home/guillermo
creating home/guillermo/PycharmProjects
creating home/guillermo/PycharmProjects/Fluid_Research
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp
creating home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src
gcc -pthread -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA=True -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src -I/home/guillermo/anaconda3/envs/shaperecon/include/python3.6m -c _cam_bp_lib.c -o ./_cam_bp_lib.o -std=c99
gcc -pthread -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA=True -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/guillermo/anaconda3/envs/shaperecon/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src -I/home/guillermo/anaconda3/envs/shaperecon/include/python3.6m -c /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src/back_projection.c -o ./home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src/back_projection.o -std=c99
gcc -pthread -shared -B /home/guillermo/anaconda3/envs/shaperecon/compiler_compat -L/home/guillermo/anaconda3/envs/shaperecon/lib -Wl,-rpath=/home/guillermo/anaconda3/envs/shaperecon/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_cam_bp_lib.o ./home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src/back_projection.o /home/guillermo/PycharmProjects/Fluid_Research/GenRe-ShapeHD/toolbox/cam_bp/cam_bp/src/back_projection_kernel.cu.o -o ./_cam_bp_lib.so
Hello,
Thank you for looking into my issue! I just wanted to follow up on this and make sure I provided the messages you wanted to see. Are these the compile messages you wanted?
Also, I am an MIT undergraduate and trying to use this repo as part of my project in the Media Lab. I pass by CSAIL often and was wondering, if you are free, maybe we can meet in person to discuss the issue I am running into?
Thank you!
Sorry for the late reply, happy to chat! I can help with the issue if you can show me your setup as well!
No worries! My supervisor @gbernal and I would be happy to chat with you! You are welcome to come by Fluid Interfaces in the Media Lab so we can show you our setup, or we can come by CSAIL if that's easier for you. What days/times are good for you?
Was there ever a resolution on this? I'm getting the same errors.
@weeoooweeooo I am getting the same errors. Did you get any solution to that?
@colinqian Did not manage to get beyond these errors, despite attempts with suggested workarounds. The deprecations in pytorch 1.0 require some non-trivial changes in the code here it seems.
I can get GenRe running on machines with CUDA 9.2 and pytorch 0.4.1. The key pieces are making sure I add the gpu arch specification to the setup.sh scripts in toolbox/, and setting these environment variables (modify as necessary for your machine):
export CPATH=$CPATH:/usr/local/cuda-9.2/include
export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Installing PyTorch 0.4.1 is itself no longer trivial: besides the correct CUDA version, it requires a specific gcc version. Once I had both, though, I found installing it with conda to be not too bad.
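For anyone unsure what the "gpu arch specification" should be: the nvcc commands posted earlier in this thread compile only for sm_30, sm_35, sm_52 and sm_61, and "no kernel image is available for execution on the device" is the CUDA error raised when the binary contains no code the GPU can run. The sketch below builds the -gencode flag for a given compute capability and checks whether an nvcc command line already names it. The helper names are made up, and the check is a simplified exact match that ignores CUDA's binary-compatibility rules. On a machine with PyTorch installed, torch.cuda.get_device_capability(0) should return the (major, minor) pair to pass in.

```python
import re


def gencode_flag(major, minor):
    """Build the nvcc -gencode flag for one compute capability, e.g. (6, 1)."""
    arch = f"{major}{minor}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


def compiled_archs(nvcc_command):
    """Return the set of sm_XX targets named in an nvcc command line."""
    return set(re.findall(r"code=sm_(\d+)", nvcc_command))


def has_exact_arch(nvcc_command, major, minor):
    """True if the command already targets exactly this compute capability."""
    return f"{major}{minor}" in compiled_archs(nvcc_command)


# The -gencode flags that appear in the build logs posted in this thread.
THREAD_FLAGS = (
    "-gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 "
    "-gencode arch=compute_52,code=sm_52 -gencode arch=compute_61,code=sm_61"
)
```

If has_exact_arch(THREAD_FLAGS, major, minor) is False for your GPU, append gencode_flag(major, minor) to the nvcc lines in each toolbox setup.sh and rebuild, provided your CUDA toolkit supports that architecture.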
@wagnew3 It works now. I got it running with CUDA 9.0 and PyTorch 0.4.1. I upgraded gcc to the latest version and added some environment variables. Thank you.
@colinqian Which version of GCC did you happen to update it to? I'm getting the same error, running with CUDA 9.0 and pytorch 0.4.1 as well.