Error when running the demo: _dcn_v2.so: undefined symbol: __cudaRegisterFatBinaryEnd
Hi, I am trying to run the demo. I followed INSTALL.md step by step and hit no errors during installation, but running the demo fails with "_dcn_v2.so: undefined symbol: __cudaRegisterFatBinaryEnd". It seems to come from the DCNv2 module. My CUDA version is 9.2.
I also tried switching to PyTorch 1.0 together with the PyTorch 1.0 version of DCNv2, but that ends in a segmentation fault (core dumped).
====== Error details follow ======
Traceback (most recent call last):
File "demo.py", line 11, in
====================== pytorch 1.0 version ======================
$ python demo.py ctdet --demo ../images --load_model ../models/ctdet_coco_dla_2x.pth
Fix size testing.
training chunk_sizes: [32]
The output will be saved to /home/glt/CenterNet/src/lib/../../exp/ctdet/default
heads {'hm': 80, 'wh': 2, 'reg': 2}
Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
Segmentation fault (core dumped)
I have no idea about this, but found this for you if it can help.
I met the same problem.
pytorch 0.4.1
CUDA 10.0
And when I run:
python demo.py ctdet --demo ../images --load_model ../models/ctdet_coco_dla_2x.pth
Traceback (most recent call last):
File "demo.py", line 11, in
I have the same problem as you, @PumayHui. How did you solve it?
I have hit the same problem as you, @PumayHui. Did you solve it?
@wwlbytedance @ShiSenSen1234 Sorry, I have not solved it yet...
@xingyizhou Can you help us~~~?
undefined symbol: __cudaPopCallConfiguration: Ensure that your PyTorch CUDA version and system CUDA version match (see Issue#19):
$ python -c "import torch; print(torch.version.cuda)"
$ nvcc --version
I got 9.0 and 9.2 respectively, so I installed the PyTorch build for CUDA 9.2:
conda install pytorch=0.4.1 cuda92 -c pytorch
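The comparison of the two commands above can also be scripted. The following is only a sketch: the helper names are made up for illustration, and it assumes that `nvcc --version` prints a line containing `release X.Y` and that `torch.version.cuda` is a string such as `9.0.176` (or `None` on a CPU-only build).

```python
import re
from typing import Optional, Tuple


def major_minor(text: Optional[str], pattern: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) found by `pattern` in `text`, or None."""
    if not text:
        return None  # e.g. torch.version.cuda is None on CPU-only builds
    match = re.search(pattern, text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cuda_versions_match(torch_cuda: Optional[str], nvcc_output: str) -> bool:
    """True only when both versions are known and agree on major.minor."""
    torch_ver = major_minor(torch_cuda, r"^(\d+)\.(\d+)")
    nvcc_ver = major_minor(nvcc_output, r"release (\d+)\.(\d+)")
    return torch_ver is not None and torch_ver == nvcc_ver


# Sample nvcc output line (assumed format, taken from a CUDA 9.2 toolkit).
sample = "Cuda compilation tools, release 9.2, V9.2.148"
print(cuda_versions_match("9.0.176", sample))  # mismatch: 9.0 vs 9.2
print(cuda_versions_match("9.2.148", sample))  # match
```

On a real machine you would pass `torch.version.cuda` and the captured output of `nvcc --version` instead of the sample strings.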
It's a CUDA version problem. Using CUDA 9 with PyTorch 0.4.1 will fix it.
I encountered the same problem when using CUDA 10.1 and PyTorch 0.4.1. Yes, it's a CUDA version problem. I switched from CUDA 10.1 to CUDA 8.0, which solved it.
Go to https://pytorch.org/ to get the PyTorch build that matches your CUDA version.
For example, for CUDA 10:
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
That will fix the problem.
I still can't fix the problem with CUDA 9 and PyTorch 0.4.1, which is strange. There is no cuDNN on my server machine; is that a problem?
I met the same problem.
pytorch 0.4.1
CUDA 10.0
And when I run:
python demo.py ctdet --demo ../images --load_model ../models/ctdet_coco_dla_2x.pth
Traceback (most recent call last):
  File "demo.py", line 11, in
    from detectors.detector_factory import detector_factory
  File "/home/CenterNet/src/lib/detectors/detector_factory.py", line 5, in
    from .exdet import ExdetDetector
  File "/home/CenterNet/src/lib/detectors/exdet.py", line 22, in
    from .base_detector import BaseDetector
  File "/home/CenterNet/src/lib/detectors/base_detector.py", line 11, in
    from models.model import create_model, load_model
  File "/home/CenterNet/src/lib/models/model.py", line 12, in
    from .networks.pose_dla_dcn import get_pose_net as get_dla_dcn
  File "/home/CenterNet/src/lib/models/networks/pose_dla_dcn.py", line 16, in
    from .DCNv2.dcn_v2 import DCN
  File "/home/CenterNet/src/lib/models/networks/DCNv2/dcn_v2.py", line 11, in
    from .dcn_v2_func import DCNv2Function
  File "/home/CenterNet/src/lib/models/networks/DCNv2/dcn_v2_func.py", line 9, in
    from ._ext import dcn_v2 as _backend
  File "/home/CenterNet/src/lib/models/networks/DCNv2/_ext/dcn_v2/__init__.py", line 3, in
    from ._dcn_v2 import lib as _lib, ffi as _ffi
ImportError: /home/CenterNet/src/lib/models/networks/DCNv2/_ext/dcn_v2/_dcn_v2.so: undefined symbol: __cudaPopCallConfiguration
https://github.com/CharlesShang/DCNv2
Using the PyTorch 1.0 version of the deformable convnets worked for me. @xingyizhou, could you verify this?
CUDA 10.0, PyTorch 0.4.1. Has anyone solved this?
Did you solve it?
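My environment was CUDA 10.0, Python 3.6, and PyTorch 0.4, and I hit this problem.
So I cloned the environment and tried a fix there: I switched to PyTorch 1.1 and torchvision 0.3 and replaced DCNv2, which finally solved the problem.
For details, see sections 4.2 and 6 of https://blog.csdn.net/weixin_38705903/article/details/102598339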
I'm using CUDA 10.1. I solved the problem by installing PyTorch 1.2.0, replacing the DCNv2 in this repo with the original DCNv2 repo, and compiling it again.
Now it works perfectly.
As @uniquezhengjie said, this error is raised when the PyTorch CUDA version and the system CUDA version do not match. The deformable convolution (DCN) in this repository requires CUDA <= 10.0, so this repository uses PyTorch 0.4 (which only supports CUDA <= 10.0); the error is raised when your system CUDA version is >= 10.0.
If you do not want to downgrade your system CUDA version, it seems that you need to adopt another DCN implementation that supports CUDA >= 10.0:
- conda install pytorch=1.0 torchvision -c pytorch
- Change your DCN, according to @zjp99:
  cd ~/Code/CenterNet/src/lib/models/networks
  rm -r DCNv2
  git clone https://github.com/CharlesShang/DCNv2.git
  cd DCNv2
  sh make.sh
  python test.py
- python demo.py ctdet --demo /path/to/image/or/folder/or/video --load_model ../models/ctdet_coco_dla_2x.pth
In my experiments, this solved the problem.
@kwea123 I'm doing exactly what you suggest: CUDA 10.1, PyTorch 1.2.0, with the replaced DCNv2. However, when I execute:
python demo.py ctdet --demo ../images/ --load_model ../models/ctdet_coco_dla_2x.pth
I get the following error:
AssertionError: Torch not compiled with CUDA enabled
Am I missing something?
Here is the full stack trace:
Fix size testing.
training chunk_sizes: [32]
The output will be saved to /home/shihkuan/workFiles/centernet/PythonAPI/CenterNet/src/lib/../../exp/ctdet/default
heads {'hm': 80, 'wh': 2, 'reg': 2}
Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
Traceback (most recent call last):
File "demo.py", line 56, in
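For what it is worth, "Torch not compiled with CUDA enabled" is the message PyTorch raises when the installed package is a CPU-only build, so it is worth checking which build actually got installed. The sketch below is an assumption-laden helper, not part of CenterNet: `describe_torch_build` is a made-up name, and it only interprets two values you can read from PyTorch yourself, `torch.version.cuda` (which is `None` on CPU-only builds) and `torch.cuda.is_available()`.

```python
from typing import Optional


def describe_torch_build(cuda_version: Optional[str], cuda_available: bool) -> str:
    """Summarize a PyTorch install from two values reported by PyTorch."""
    if cuda_version is None:
        # CPU-only packages report no CUDA version at all.
        return "CPU-only PyTorch build: install a CUDA-enabled package"
    if not cuda_available:
        return ("PyTorch was built for CUDA " + cuda_version
                + ", but no usable GPU/driver was found")
    return "PyTorch was built for CUDA " + cuda_version + " and a GPU is usable"


# Hypothetical values standing in for a CPU-only install.
print(describe_torch_build(None, False))
```

In a real session the call would be `describe_torch_build(torch.version.cuda, torch.cuda.is_available())`.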
When I use PyTorch 1.0 with CUDA 10.1, as you suggested, and run:
python demo.py ctdet --demo ../images/ --load_model ../models/ctdet_coco_dla_2x.pth
I get the following error:
ImportError: /home/shihkuan/workFiles/centernet/PythonAPI/CenterNet/src/lib/models/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E
Do you happen to know what is causing this?
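One hint is in the symbol name itself. The sketch below decodes it by hand; `demangle_nested_name` is a hypothetical helper that handles only plain length-prefixed nested names of the Itanium C++ ABI (no templates or function signatures). The decoded name lives in the `caffe2::detail` namespace, which is part of PyTorch itself, so a likely (but unconfirmed) explanation is that `_ext` was compiled against a different PyTorch version than the one being imported, and rebuilding DCNv2 in the active environment may help.

```python
def demangle_nested_name(symbol: str) -> str:
    """Decode a simple mangled nested name such as _ZN3foo3barE."""
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError("not a simple nested name")
    body = symbol[3:-1]  # drop the "_ZN" prefix and the closing "E"
    parts = []
    pos = 0
    while pos < len(body):
        start = pos
        while pos < len(body) and body[pos].isdigit():
            pos += 1  # read the decimal length prefix
        if pos == start:
            raise ValueError("expected a length prefix")
        length = int(body[start:pos])
        if pos + length > len(body):
            raise ValueError("length prefix runs past the end")
        parts.append(body[pos:pos + length])
        pos += length
    return "::".join(parts)


symbol = "_ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E"
print(demangle_nested_name(symbol))
# caffe2::detail::_typeMetaDataInstance_preallocated_32
```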
Do I have to downgrade my system CUDA version? I created a new conda environment and installed PyTorch 0.4.1 and CUDA 9.0 in it, but the error still occurs when I run demo.py. So I wonder how the system CUDA relates to the CUDA inside conda. Can you give me any suggestions?
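On the relation between the two: as far as I understand, the CUDA package installed by conda provides the runtime libraries that the prebuilt PyTorch binary uses, while the DCNv2 extension is compiled by whichever `nvcc` the build finds on the machine, which is usually the system toolkit. The sketch below shows one way to see which `nvcc` that would be. `find_nvcc` is a made-up helper, and the lookup order (the `CUDA_HOME` variable first, then `PATH`) is an assumption that only approximates what build scripts do.

```python
import os
import shutil
from typing import Mapping, Optional


def find_nvcc(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the nvcc a build would most likely pick up, or None."""
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if cuda_home:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate):
            return candidate  # an explicit toolkit location wins
    # Otherwise fall back to the first nvcc on PATH (empty PATH finds nothing).
    return shutil.which("nvcc", path=env.get("PATH", ""))


print(find_nvcc())  # path of the system nvcc, or None if there is none
```

If the path printed here belongs to a toolkit whose version differs from `torch.version.cuda`, the compiled extension and PyTorch will disagree even inside a fresh conda environment.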
@lihuining Have you solved this problem? I have hit the same problem.
@DeepAlchemist
Does this project have two branches? When I use PyTorch 0.4.1 and cuda90, compilation fails:
running build_ext
building '_ext' extension
g++ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src -I/home1/linazhan/.conda/envs/CenterTrack3/lib/python3.6/site-packages/torch/lib/include -I/home1/linazhan/.conda/envs/CenterTrack3/lib/python3.6/site-packages/torch/lib/include/TH -I/home1/linazhan/.conda/envs/CenterTrack3/lib/python3.6/site-packages/torch/lib/include/THC -I/spack/apps/linux-centos7-x86_64/gcc-4.9.4/cuda-9.2.88-nak6j4dtwls6r42eaqmpx5krncqhwrnh/include -I/home1/linazhan/.conda/envs/CenterTrack3/include/python3.6m -c /home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/vision.cpp -o build/temp.linux-x86_64-3.6/home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/vision.o -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
In file included from /home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/dcn_v2.h:3:0,
                 from /home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/vision.cpp:2:
/home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/cpu/vision.h:2:29: fatal error: torch/extension.h: No such file or directory
 #include <torch/extension.h>
I searched Google, and the answers say the PyTorch version is too low.
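That matches my understanding: `torch/extension.h` ships with PyTorch 1.0 and later, so a DCNv2 source tree that includes it cannot be built against PyTorch 0.4.1. A quick way to confirm this for a given environment is to look for the header inside the installed package. The sketch below is illustrative only; `find_header` is a made-up helper, not a PyTorch API.

```python
import os
from typing import List


def find_header(root: str, header: str = "torch/extension.h") -> List[str]:
    """Return every file under `root` whose path ends with `header`."""
    wanted = os.path.normpath(header)  # use the platform's path separator
    filename = os.path.basename(wanted)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            full = os.path.join(dirpath, filename)
            if full.endswith(os.sep + wanted):
                hits.append(full)
    return hits
```

With PyTorch installed, `find_header(os.path.dirname(torch.__file__))` returning an empty list would suggest that the installed version predates the header and is too old for this DCNv2 source.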
@guanxiongsun When I use CUDA 9 and PyTorch 0.4.1 to compile DCNv2, it reports:
/home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/cpu/vision.h:2:29: fatal error: torch/extension.h: No such file or directory
 #include <torch/extension.h>