faster-rcnn.pytorch

nvcc fatal : Unsupported gpu architecture 'compute_75'

Open alontrais opened this issue 4 years ago • 9 comments

I use a Tesla T4 and added this to CUDA_ARCH: -gencode arch=compute_75,code=sm_75

but when I run make.sh I get this error: nvcc fatal : Unsupported gpu architecture 'compute_75'

How do I fix this error?

alontrais avatar Feb 25 '20 08:02 alontrais

Which version of CUDA are you using? Compute capability 7.5 is supported starting with CUDA 10.0.

loolzaaa avatar Feb 25 '20 14:02 loolzaaa
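The version check suggested above can be sketched in Python. This is a minimal illustration, not part of the repository: the helper names and the lookup table are hypothetical, and only the 7.5 entry (CUDA 10.0) comes from this thread. It reads the "release X.Y" text that `nvcc --version` prints and compares it with the minimum release for the requested architecture.

```python
import re
import shutil
import subprocess

# Minimum CUDA toolkit release that accepts a given compute capability.
# Only the 7.5 entry comes from this thread; 7.0 is added for comparison.
MIN_CUDA_FOR_ARCH = {"7.0": (9, 0), "7.5": (10, 0)}


def parse_nvcc_release(version_text):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def nvcc_supports(version_text, arch):
    """Return True if the reported toolkit is new enough for `arch`."""
    return parse_nvcc_release(version_text) >= MIN_CUDA_FOR_ARCH[arch]


def installed_nvcc_version_text():
    """Run the nvcc found on PATH, or return None if there is none."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return result.stdout
```

Calling `nvcc_supports(installed_nvcc_version_text(), "7.5")` on the build machine shows whether the nvcc that is first on PATH can compile for compute_75. If several CUDA versions are installed, the one on PATH may not be the newest.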

@loolzaaa Thank you so much for the response! I upgraded to CUDA 10.0, but now I get an import error when I run the train_val script: ImportError: /home/trais_user/pipeline/algorithms/faster-rcnn.pytorch/lib/model/roi_crop/_ext/roi_crop/_roi_crop.so: undefined symbol: __cudaPopCallConfiguration

Do you recognize this error? Can I fix it while staying on CUDA 10?

alontrais avatar Feb 26 '20 13:02 alontrais

Switch this repo to the pytorch-1.0 branch if you can. Then try to build the project with setup.py, following readme.md.

loolzaaa avatar Feb 26 '20 13:02 loolzaaa

@loolzaaa Thank you! I switched the branch to pytorch-1.0: git checkout pytorch-1.0

I ran setup.py: cd lib, then python setup.py build

but I get an error:

running build
running build_py
running build_ext
building 'model._C' extension
gcc -pthread -B /home/trais_user/.conda/envs/fasterrcnn/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/trais_user/pipeline/algorithms/faster-rcnn.pytorch/lib/model/csrc -I/home/trais_user/.conda/envs/fasterrcnn/lib/python2.7/site-packages/torch/lib/include -I/home/trais_user/.conda/envs/fasterrcnn/lib/python2.7/site-packages/torch/lib/include/TH -I/home/trais_user/.conda/envs/fasterrcnn/lib/python2.7/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/trais_user/.conda/envs/fasterrcnn/include/python2.7 -c /home/trais_user/pipeline/algorithms/faster-rcnn.pytorch/lib/model/csrc/vision.cpp -o build/temp.linux-x86_64-2.7/home/trais_user/pipeline/algorithms/faster-rcnn.pytorch/lib/model/csrc/vision.o -DTORCH_EXTENSION_NAME=model._C -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/trais_user/pipeline/algorithms/faster-rcnn.pytorch/lib/model/csrc/nms.h:3:0,
                 from /home/trais_user/pipeline/algorithms/faster-rcnn.pytorch/lib/model/csrc/vision.cpp:2:
/home/trais_user/pipeline/algorithms/faster-rcnn.pytorch/lib/model/csrc/cpu/vision.h:3:10: fatal error: torch/extension.h: No such file or directory
 #include <torch/extension.h>
          ^~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1

How can I fix this error?

alontrais avatar Mar 03 '20 10:03 alontrais

#include <torch/extension.h>

This is a standard header that ships with the PyTorch library.

I see that you are using conda with the old Python 2.7. Which version of PyTorch do you use?

I think the problem is in the installed PyTorch library, because current PyTorch versions do not support Python 2.x. So, if you can, create a new conda environment, install Python 3.x (3.8, for example), install the latest PyTorch version, and try to build again.

loolzaaa avatar Mar 03 '20 14:03 loolzaaa
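One way to confirm whether the installed PyTorch actually ships the header from the error message is to search the package directory for it. The sketch below is illustrative only: `find_extension_header` is a hypothetical helper, and it makes no assumption about where a given PyTorch version keeps its include directory.

```python
import os


def find_extension_header(package_root):
    """Return every path under package_root that ends in torch/extension.h."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(package_root):
        # The compiler resolves <torch/extension.h>, so the file must sit
        # in a directory that is itself named "torch".
        if os.path.basename(dirpath) == "torch" and "extension.h" in filenames:
            hits.append(os.path.join(dirpath, "extension.h"))
    return sorted(hits)
```

With PyTorch importable, `find_extension_header(os.path.dirname(torch.__file__))` lists the matches. An empty list suggests the installed PyTorch is too old for the pytorch-1.0 branch, whatever Python version is in use.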

@loolzaaa I created a new conda environment, installed Python 3.8, and installed the latest PyTorch version, 1.4.0. I switched to the pytorch-1.0 branch and built again. When I run the trainval script I get:

faster-rcnn.pytorch/lib/model/roi_layers/nms.py", line 3, in <module>
from model import _C
ImportError: cannot import name _C

When I switch to the master branch and run the make.sh script I get:

.conda/envs/frcnn/lib/python2.7/site-packages/torch/utils/ffi/__init__.py", line 1, in <module>
raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.

I found that I would need to downgrade PyTorch from 1.0 to 0.4 to resolve this issue on master.

How can I resolve this with PyTorch 1.0?

alontrais avatar Mar 08 '20 09:03 alontrais

ImportError: cannot import name _C

If you build the project with python setup.py build, you need to copy the _C.....pyd library (on Linux the file has a .so extension instead) from the lib/build/lib..../model folder to your lib/model folder.

If you build with python setup.py build develop, the extension is placed in lib/model for you and no copy is needed.

loolzaaa avatar Mar 08 '20 11:03 loolzaaa
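The manual copy step described above could be scripted roughly as follows. This is a sketch under assumptions: `copy_built_extension` is a hypothetical helper, and the glob pattern `build/lib*/model/_C*` is a guess at the elided folder and file names, which depend on the platform and Python version.

```python
import glob
import os
import shutil


def copy_built_extension(lib_dir):
    """Copy build/lib*/model/_C* into lib_dir/model and return the new paths."""
    # Assumed layout: <lib_dir>/build/lib<platform tag>/model/_C<suffix>
    pattern = os.path.join(lib_dir, "build", "lib*", "model", "_C*")
    copied = []
    for src in sorted(glob.glob(pattern)):
        dst = os.path.join(lib_dir, "model", os.path.basename(src))
        shutil.copy2(src, dst)  # keep timestamps so stale copies are easy to spot
        copied.append(dst)
    return copied
```

Run from the repository root, `copy_built_extension("lib")` returns the list of files it copied. An empty list means the build did not produce a `_C` library where this pattern expects it.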

My configuration is: CUDA 10.2, RTX2080i, torch 1.5. When I ran setup.py, I had the same problem: nvcc fatal : Unsupported gpu architecture 'compute_75'. My solution was to run export TORCH_CUDA_ARCH_LIST="7.0" before running python setup.py build develop on the command line. Hope this helps!

momo666666 avatar Jun 09 '20 11:06 momo666666
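The same workaround can be applied from Python by passing a modified environment to the build process instead of exporting the variable in the shell. The helper names below are hypothetical. The only thing taken from the comment above is the TORCH_CUDA_ARCH_LIST variable and its "7.0" value.

```python
import os


def capability_to_arch(capability):
    """Turn a (major, minor) pair such as (7, 5) into the string "7.5"."""
    major, minor = capability
    return f"{major}.{minor}"


def env_with_arch_list(arch_list, base_env=None):
    """Return a copy of the environment with TORCH_CUDA_ARCH_LIST set."""
    env = dict(os.environ if base_env is None else base_env)
    env["TORCH_CUDA_ARCH_LIST"] = arch_list
    return env
```

A build could then be started with `subprocess.run([sys.executable, "setup.py", "build", "develop"], cwd="lib", env=env_with_arch_list("7.0"))`. Code built for 7.0 should still run on a 7.5 card, which is probably why this workaround works. It only changes which architectures are requested, though, so an nvcc older than CUDA 10.0 will still reject compute_75 if that is what gets asked for.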

I got error: command '/usr/bin/nvcc' failed with exit status 1 when I tried export TORCH_CUDA_ARCH_LIST="7.0". How can I solve it?

SamMohel avatar Mar 23 '21 12:03 SamMohel