PreciseRoIPooling

Support for 3080 or 3090 GPUs

Open EraProphet opened this issue 4 years ago • 21 comments

Hi! I have a 3090 GPU, but I run into problems when compiling DCN. The system is Ubuntu 18.04 and the PyTorch version is 1.7.0. The error is:

nvcc fatal : Unsupported gpu architecture 'compute_86'

How can I fix this?

EraProphet avatar Nov 18 '20 02:11 EraProphet

I believe this is a PyTorch compatibility issue. It seems that the PyTorch on-the-fly compilation module needs to be updated. Sorry, since I don't have access to a 3080 GPU, I can't test it myself...

vacancy avatar Nov 20 '20 21:11 vacancy
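The "Unsupported gpu architecture 'compute_86'" message typically means the nvcc that PyTorch invokes is older than CUDA 11.1 and so cannot target the Ampere compute_86/sm_86 architecture that PyTorch derives from a 30X0 card. A rough diagnostic sketch (standard torch APIs only; TORCH_CUDA_ARCH_LIST is an optional override, and device index 0 is assumed):

import os
import torch
from torch.utils.cpp_extension import CUDA_HOME

print("torch:", torch.__version__, "| built against CUDA:", torch.version.cuda)
print("CUDA_HOME used for JIT builds:", CUDA_HOME)
if torch.cuda.is_available():
    # PyTorch derives the -gencode flags from the device capability (8.6 for 30X0)
    # unless TORCH_CUDA_ARCH_LIST overrides it.
    print("device capability:", torch.cuda.get_device_capability(0))
print("TORCH_CUDA_ARCH_LIST:", os.environ.get("TORCH_CUDA_ARCH_LIST"))

If CUDA_HOME points at a toolkit older than 11.1, nvcc will reject compute_86 even though the installed driver supports the GPU.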

I also encountered this problem. Has it been solved yet?

xiaolin13 avatar Dec 15 '20 12:12 xiaolin13

Can anyone confirm whether 3080 GPUs are supported with the latest PyTorch version? If this module still fails to compile, can anyone upload their error logs? It would also be helpful if anyone could provide the build.ninja file automatically generated by the PyTorch library.

Please directly reply in this thread and I am happy to help.

vacancy avatar Dec 16 '20 19:12 vacancy

> Can anyone confirm whether 3080 GPUs are supported with the latest PyTorch version? If this module still fails to compile, can anyone upload their error logs? It would also be helpful if anyone could provide the build.ninja file automatically generated by the PyTorch library.
>
> Please directly reply in this thread and I am happy to help.

[screenshot: compilation error log]

I use PyTorch 1.7.0 and a GeForce RTX 3090.

ReedZyd avatar Dec 25 '20 09:12 ReedZyd

Can you go to the actual ninja temporary folder and run ninja and copy-paste the relevant logs?

vacancy avatar Dec 28 '20 02:12 vacancy
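For reference, the on-the-fly build happens under PyTorch's extensions root, so the generated build.ninja and the ninja output can be collected from there. A rough sketch of how to locate it (assuming the default extension name _prroi_pooling used by this repo; some PyTorch versions add an extra subdirectory under this root):

import os

# PyTorch's JIT extension builds default to ~/.cache/torch_extensions unless
# the TORCH_EXTENSIONS_DIR environment variable overrides it (see the
# "Using ... as PyTorch extensions root" lines in the logs later in this thread).
root = os.environ.get("TORCH_EXTENSIONS_DIR",
                      os.path.expanduser("~/.cache/torch_extensions"))
build_dir = os.path.join(root, "_prroi_pooling")
print("build.ninja should be at:", os.path.join(build_dir, "build.ninja"))
print("run `ninja -v` inside:", build_dir)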

[screenshot: ninja build log]

ReedZyd avatar Dec 29 '20 06:12 ReedZyd

It might be due to the PyTorch version. How can I use this library with PyTorch 1.7.0?

ReedZyd avatar Jan 18 '21 04:01 ReedZyd

> It might be due to the PyTorch version. How can I use this library with PyTorch 1.7.0?

Can you use this library with PyTorch 1.7.0 now?

wenhaixi avatar Apr 18 '21 05:04 wenhaixi

This library works fine for me with PyTorch 1.7 on Titan X, Titan Xp, and Titan RTX, but I can't test it with a 3080. If you encounter any problem using a 30X0, please check your CUDA installation (make sure nvcc is recent enough to compile CUDA files for 30X0 GPUs). If the problem remains, please provide the detailed compilation log: not only the Python ImportError messages, but the full log, including the warnings/errors/failures produced by ninja/g++/nvcc.

vacancy avatar Apr 28 '21 18:04 vacancy

> This library works fine for me with PyTorch 1.7 on Titan X, Titan Xp, and Titan RTX, but I can't test it with a 3080. If you encounter any problem using a 30X0, please check your CUDA installation (make sure nvcc is recent enough to compile CUDA files for 30X0 GPUs). If the problem remains, please provide the detailed compilation log: not only the Python ImportError messages, but the full log, including the warnings/errors/failures produced by ninja/g++/nvcc.

I use a 3090 with PyTorch 1.7.1 and CUDA 11.2, and I met this error:

Using /home/yy/.cache/torch_extensions as PyTorch extensions root...
Creating extension directory /home/yy/.cache/torch_extensions/_prroi_pooling...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/yy/.cache/torch_extensions/_prroi_pooling/build.ninja...
Building extension module _prroi_pooling...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: error: build.ninja:3: bad $-escape (literal $ must be written as $$)
nvcc = $/usr/local/cuda-11.2/bin/nvcc
       ^ near here
Traceback (most recent call last):
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1533, in _run_ninja_build
    subprocess.run(
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yy/data/TransformerTrack-new/pytracking/GOT10k_GOT.py", line 49, in <module>
    experiment.run(tracker, visualize=False)
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/got10k/experiments/got10k.py", line 80, in run
    boxes, times = tracker.track(
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/got10k/trackers/init.py", line 35, in track
    self.init(image, box)
  File "/home/yy/data/TransformerTrack-new/pytracking/GOT10k_GOT.py", line 34, in init
    self.tracker.initialize(image, box)
  File "/home/yy/data/TransformerTrack-new/pytracking/tracker/trdimp/trdimp_for_GOT.py", line 89, in initialize
    self.init_classifier(init_backbone_feat)
  File "/home/yy/data/TransformerTrack-new/pytracking/tracker/trdimp/trdimp_for_GOT.py", line 625, in init_classifier
    self.target_filter, _, losses = self.net.classifier.get_filter(x, target_boxes, num_iter=num_iter,
  File "/home/yy/data/TransformerTrack-new/ltr/models/target_classifier/linear_filter.py", line 114, in get_filter
    weights = self.filter_initializer(feat, bb)
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yy/data/TransformerTrack-new/ltr/models/target_classifier/initializer.py", line 164, in forward
    weights = self.filter_pool(feat, bb)
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yy/data/TransformerTrack-new/ltr/models/target_classifier/initializer.py", line 45, in forward
    return self.prroi_pool(feat, roi1)
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yy/data/TransformerTrack-new/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/prroi_pool.py", line 28, in forward
    return prroi_pool2d(features, rois, self.pooled_height, self.pooled_width, self.spatial_scale)
  File "/home/yy/data/TransformerTrack-new/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/functional.py", line 44, in forward
    _prroi_pooling = _import_prroi_pooling()
  File "/home/yy/data/TransformerTrack-new/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/functional.py", line 30, in _import_prroi_pooling
    _prroi_pooling = load_extension(
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 986, in load
    return _jit_compile(
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1193, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1297, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/yy/anaconda3/envs/pytracking/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1555, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_prroi_pooling'

yuyudiandian avatar Jul 09 '21 08:07 yuyudiandian

I have already solved that problem, but another one has appeared:

Using /home/yy/.cache/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/yy/.cache/torch_extensions/_prroi_pooling/build.ninja...
Building extension module _prroi_pooling...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /home/yy/tools/cuda-9.0:/usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output prroi_pooling_gpu_impl.cuda.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/TH -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/THC -isystem /home/yy/tools/cuda-9.0:/usr/local/cuda/include -isystem /home/yry/data/anaconda3/envs/transyry/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /home/yy/data/TransformerTrack/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu_impl.cu -o prroi_pooling_gpu_impl.cuda.o
FAILED: prroi_pooling_gpu_impl.cuda.o
/home/yy/tools/cuda-9.0:/usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output prroi_pooling_gpu_impl.cuda.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/TH -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/THC -isystem /home/yy/tools/cuda-9.0:/usr/local/cuda/include -isystem /home/yry/data/anaconda3/envs/transyry/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /home/yy/data/TransformerTrack/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu_impl.cu -o prroi_pooling_gpu_impl.cuda.o
/bin/sh: 1: /home/yy/tools/cuda-9.0:/usr/local/cuda/bin/nvcc: not found
[2/3] c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/TH -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/THC -isystem /home/yy/tools/cuda-9.0:/usr/local/cuda/include -isystem /home/yry/data/anaconda3/envs/transyry/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /home/yy/data/TransformerTrack/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o
FAILED: prroi_pooling_gpu.o
c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/TH -isystem /home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/THC -isystem /home/yy/tools/cuda-9.0:/usr/local/cuda/include -isystem /home/yry/data/anaconda3/envs/transyry/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /home/yy/data/TransformerTrack/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o
In file included from /home/yy/data/TransformerTrack/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c:15:0:
/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h: No such file or directory
 #include <cuda_runtime_api.h>
          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1673, in _run_ninja_build
    env=env)
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yy/data/TransformerTrack/pytracking/GOT10k_GOT.py", line 46, in <module>
    experiment.run(tracker, visualize=False)
  File "/home/yy/.local/lib/python3.7/site-packages/got10k/experiments/got10k.py", line 81, in run
    img_files, anno[0, :], visualize=visualize)
  File "/home/yy/.local/lib/python3.7/site-packages/got10k/trackers/init.py", line 35, in track
    self.init(image, box)
  File "/home/yy/data/TransformerTrack/pytracking/GOT10k_GOT.py", line 32, in init
    self.tracker.initialize(image, box)
  File "/home/yy/data/TransformerTrack/pytracking/tracker/trdimp/trdimp_for_GOT.py", line 89, in initialize
    self.init_classifier(init_backbone_feat)
  File "/home/yy/data/TransformerTrack/pytracking/tracker/trdimp/trdimp_for_GOT.py", line 619, in init_classifier
    compute_losses=plot_loss)
  File "/home/yy/data/TransformerTrack/ltr/models/target_classifier/linear_filter.py", line 114, in get_filter
    weights = self.filter_initializer(feat, bb)
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yy/data/TransformerTrack/ltr/models/target_classifier/initializer.py", line 164, in forward
    weights = self.filter_pool(feat, bb)
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yy/data/TransformerTrack/ltr/models/target_classifier/initializer.py", line 45, in forward
    return self.prroi_pool(feat, roi1)
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yy/data/TransformerTrack/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/prroi_pool.py", line 28, in forward
    return prroi_pool2d(features, rois, self.pooled_height, self.pooled_width, self.spatial_scale)
  File "/home/yy/data/TransformerTrack/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/functional.py", line 44, in forward
    _prroi_pooling = _import_prroi_pooling()
  File "/home/yy/data/TransformerTrack/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/functional.py", line 33, in _import_prroi_pooling
    verbose=True
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
    keep_intermediates=keep_intermediates)
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1302, in _jit_compile
    is_standalone=is_standalone)
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1407, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_prroi_pooling'

yuyudiandian avatar Jul 09 '21 11:07 yuyudiandian

@yuyudiandian

/home/yry/data/anaconda3/envs/transyry/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h: No such file or directory
 #include <cuda_runtime_api.h>
          ^~~~~~~~~~~~~~~~~~~~

There's an issue with your CUDA installation/configuration.

vacancy avatar Jul 09 '21 16:07 vacancy
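The ":/usr/local/cuda/bin/nvcc: not found" and "cuda_runtime_api.h: No such file or directory" failures in these logs point to a malformed CUDA path: the build is joining bin/nvcc onto a colon-separated, PATH-style value instead of a single CUDA root directory. A small, hedged check of what PyTorch will actually use (standard torch APIs only):

import os
from torch.utils.cpp_extension import CUDA_HOME

# CUDA_HOME should be a single directory such as /usr/local/cuda-11.x.
# A PATH-like value (e.g. "/home/yy/tools/cuda-9.0:/usr/local/cuda") yields the
# ":/usr/local/cuda/bin/nvcc: not found" error shown in the logs above.
print("CUDA_HOME (environment):", os.environ.get("CUDA_HOME"))
print("CUDA_HOME (PyTorch):    ", CUDA_HOME)
if CUDA_HOME:
    print("nvcc found:", os.path.exists(os.path.join(CUDA_HOME, "bin", "nvcc")))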

I also encountered some errors on my 3090 with CUDA 11.1 and PyTorch 1.8; here are my logs:

Using /home/ubntun/.cache/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/ubntun/.cache/torch_extensions/_prroi_pooling/build.ninja...
Building extension module _prroi_pooling...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] :/usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output prroi_pooling_gpu_impl.cuda.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/TH -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/THC -isystem :/usr/local/cuda/include -isystem /home/ubntun/anaconda3/envs/lsm/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /media/ubntun/4T-1/lsm/xxx/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu_impl.cu -o prroi_pooling_gpu_impl.cuda.o 
FAILED: prroi_pooling_gpu_impl.cuda.o 
:/usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output prroi_pooling_gpu_impl.cuda.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/TH -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/THC -isystem :/usr/local/cuda/include -isystem /home/ubntun/anaconda3/envs/lsm/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /media/ubntun/4T-1/lsm/xxx/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu_impl.cu -o prroi_pooling_gpu_impl.cuda.o 
/bin/sh: 1: :/usr/local/cuda/bin/nvcc: not found
[2/3] c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/TH -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/THC -isystem :/usr/local/cuda/include -isystem /home/ubntun/anaconda3/envs/lsm/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /media/ubntun/4T-1/lsm/xxx/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o 
FAILED: prroi_pooling_gpu.o 
c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/TH -isystem /home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/THC -isystem :/usr/local/cuda/include -isystem /home/ubntun/anaconda3/envs/lsm/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /media/ubntun/4T-1/lsm/xxx/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o 
In file included from /media/ubntun/4T-1/lsm/xxx/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c:15:
/home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h: 

laisimiao avatar Apr 22 '22 15:04 laisimiao

@laisimiao Seems that your log got trimmed.

/home/ubntun/anaconda3/envs/lsm/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h:

What's the error?

vacancy avatar Apr 22 '22 16:04 vacancy

No, that's all the log I can get.

laisimiao avatar Apr 22 '22 16:04 laisimiao

@vacancy I have solved it by editing one line of prroi_pooling_gpu.c: https://github.com/vacancy/PreciseRoIPooling/blob/cf104012192337f3c3f004ff2f11baabe1368a4d/pytorch/prroi_pool/src/prroi_pooling_gpu.c#L36-L40

into this:
    /* .data<float>() replaced with .data_ptr<float>() (the former is deprecated in recent PyTorch) */
    PrRoIPoolingForwardGpu(
        stream, features.data_ptr<float>(), rois.data_ptr<float>(), output.data_ptr<float>(),
        nr_channels, height, width, pooled_height, pooled_width, spatial_scale,
        top_count
    );

laisimiao avatar Apr 24 '22 06:04 laisimiao

That's interesting... I think this is only a deprecation warning rather than an error. https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor9toBackendE7Backend

template<typename T> C10_DEPRECATED_MESSAGE ("Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead.") T *data() const

I don't know why this causes an error on your side, but good to hear that you have resolved the issue!

vacancy avatar Apr 24 '22 14:04 vacancy

PyTorch (after version 1.5) removed THC/THC.h, and the 3090 only supports PyTorch 1.7+. After removing all the lines relating to THC/THCudaCheck, it compiles successfully. The configuration is CUDA 11.3 + NVIDIA Driver 470 + Torch 1.10.0.

ReedZyd avatar Apr 28 '22 09:04 ReedZyd
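Once the extension compiles, a small smoke test along these lines can confirm that the JIT build and the CUDA kernel actually work. This is only a sketch: it assumes the repo's pytorch/prroi_pool package is importable, a CUDA device is available, and the PrRoIPool2D(pooled_height, pooled_width, spatial_scale) constructor described in the README:

import torch
from prroi_pool import PrRoIPool2D

pool = PrRoIPool2D(7, 7, spatial_scale=0.5)
features = torch.rand(1, 16, 24, 32).cuda()
# RoIs are (batch_index, x0, y0, x1, y1) in input-image coordinates.
rois = torch.tensor([[0.0, 0.0, 0.0, 14.0, 14.0]]).cuda()
output = pool(features, rois)  # the first call triggers the ninja/nvcc build
print(output.shape)            # expected: torch.Size([1, 16, 7, 7])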

Thanks @ReedZyd, I see. I haven't been tracking this project and recent PyTorch updates. I can try to take a look when I get some time... Thanks a ton for the pointers!

vacancy avatar Apr 28 '22 18:04 vacancy

@ReedZyd Interesting. I tried the latest installation of PyTorch 1.10 and I am able to compile the library; the THC files are successfully found. On my server, the path is:

anaconda/envs/torch1.10/lib/python3.9/site-packages/torch/include/THC/THC.h

and the THCudaCheck is in

anaconda/envs/torch1.10/lib/python3.9/site-packages/torch/include/THC/THCGeneral.h

vacancy avatar Apr 29 '22 19:04 vacancy
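A quick way to check whether a given PyTorch installation still ships these headers is a check like the following (a minimal sketch; it only assumes a standard pip/conda torch install whose include directory sits next to the package, as in the paths above):

import os
import torch

# The headers discussed above live under <torch package>/include/THC when present.
include_dir = os.path.join(os.path.dirname(torch.__file__), "include")
for header in ("THC/THC.h", "THC/THCGeneral.h"):
    path = os.path.join(include_dir, header)
    print(header, "->", "found" if os.path.exists(path) else "missing", path)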