CenterNet: error when running the demo: _dcn_v2.so: undefined symbol: __cudaRegisterFatBinaryEnd

Hi, I am trying to run the demo. I followed the steps in INSTALL.md exactly and the installation finished without errors, but running the demo fails with "_dcn_v2.so: undefined symbol: __cudaRegisterFatBinaryEnd". It seems to come from the DCNv2 module. My CUDA version is 9.2. I also tried switching to the PyTorch 1.0 version of DCNv2 together with PyTorch 1.0, but that ends in a segmentation fault (core dumped).

====== Detailed error information ======
Traceback (most recent call last):
  File "demo.py", line 11, in <module>
    from detectors.detector_factory import detector_factory
  File "/home/glt/oneshot/CenterNet/src/lib/detectors/detector_factory.py", line 5, in <module>
    from .exdet import ExdetDetector
  File "/home/glt/oneshot/CenterNet/src/lib/detectors/exdet.py", line 22, in <module>
    from .base_detector import BaseDetector
  File "/home/glt/oneshot/CenterNet/src/lib/detectors/base_detector.py", line 11, in <module>
    from models.model import create_model, load_model
  File "/home/glt/oneshot/CenterNet/src/lib/models/model.py", line 12, in <module>
    from .networks.pose_dla_dcn import get_pose_net as get_dla_dcn
  File "/home/glt/oneshot/CenterNet/src/lib/models/networks/pose_dla_dcn.py", line 16, in <module>
    from .DCNv2.dcn_v2 import DCN
  File "/home/glt/oneshot/CenterNet/src/lib/models/networks/DCNv2/dcn_v2.py", line 11, in <module>
    from .dcn_v2_func import DCNv2Function
  File "/home/glt/oneshot/CenterNet/src/lib/models/networks/DCNv2/dcn_v2_func.py", line 9, in <module>
    from ._ext import dcn_v2 as _backend
  File "/home/glt/oneshot/CenterNet/src/lib/models/networks/DCNv2/_ext/dcn_v2/__init__.py", line 3, in <module>
    from ._dcn_v2 import lib as _lib, ffi as _ffi
ImportError: /home/glt/oneshot/CenterNet/src/lib/models/networks/DCNv2/_ext/dcn_v2/_dcn_v2.so: undefined symbol: __cudaRegisterFatBinaryEnd

====== PyTorch 1.0 version ======
$ python demo.py ctdet --demo ../images --load_model ../models/ctdet_coco_dla_2x.pth
Fix size testing.
training chunk_sizes: [32]
The output will be saved to /home/glt/CenterNet/src/lib/../../exp/ctdet/default
heads {'hm': 80, 'wh': 2, 'reg': 2}
Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
Segmentation fault (core dumped)

May 11 '19 03:05 guanlinting

I have no idea about this, but found this for you if it can help.

May 13 '19 14:05 xingyizhou

I met the same problem with PyTorch 0.4.1 and CUDA 10.0. When I run:

$ python demo.py ctdet --demo ../images --load_model ../models/ctdet_coco_dla_2x.pth

I get:

Traceback (most recent call last):
  File "demo.py", line 11, in <module>
    from detectors.detector_factory import detector_factory
  File "/home/CenterNet/src/lib/detectors/detector_factory.py", line 5, in <module>
    from .exdet import ExdetDetector
  File "/home/CenterNet/src/lib/detectors/exdet.py", line 22, in <module>
    from .base_detector import BaseDetector
  File "/home/CenterNet/src/lib/detectors/base_detector.py", line 11, in <module>
    from models.model import create_model, load_model
  File "/home/CenterNet/src/lib/models/model.py", line 12, in <module>
    from .networks.pose_dla_dcn import get_pose_net as get_dla_dcn
  File "/home/CenterNet/src/lib/models/networks/pose_dla_dcn.py", line 16, in <module>
    from .DCNv2.dcn_v2 import DCN
  File "/home/CenterNet/src/lib/models/networks/DCNv2/dcn_v2.py", line 11, in <module>
    from .dcn_v2_func import DCNv2Function
  File "/home/CenterNet/src/lib/models/networks/DCNv2/dcn_v2_func.py", line 9, in <module>
    from ._ext import dcn_v2 as _backend
  File "/home/CenterNet/src/lib/models/networks/DCNv2/_ext/dcn_v2/__init__.py", line 3, in <module>
    from ._dcn_v2 import lib as _lib, ffi as _ffi
ImportError: /home/CenterNet/src/lib/models/networks/DCNv2/_ext/dcn_v2/_dcn_v2.so: undefined symbol: __cudaPopCallConfiguration

May 27 '19 03:05 PumayHui

I have the same problem as you, @PumayHui. How did you solve it?

Jun 03 '19 03:06 wwlbytedance

I have met the same problem as you, @PumayHui. Did you solve it?

Jun 03 '19 03:06 SeeeeShiwei

@wwlbytedance @ShiSenSen1234 Sorry, I have not solved it yet.

Jun 03 '19 03:06 PumayHui

@xingyizhou Can you help us~~~?

Jun 03 '19 05:06 SeeeeShiwei

undefined symbol: __cudaPopCallConfiguration: Ensure that your PyTorch CUDA version and system CUDA version match (see Issue#19):

$ python -c "import torch; print(torch.version.cuda)"
$ nvcc --version

I got 9.0 and 9.2 respectively, so I installed a PyTorch build that matches CUDA 9.2:

$ conda install pytorch=0.4.1 cuda92 -c pytorch

Jun 13 '19 10:06 uniquezhengjie
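
The two-command check suggested by @uniquezhengjie can also be scripted. What follows is only a minimal sketch: the helper names (parse_nvcc_release, versions_match, and so on) are made up for illustration, and it assumes that comparing major.minor is enough to spot the mismatch discussed in this thread. It relies only on `torch.version.cuda` and `nvcc --version`, both of which appear above.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return 'major.minor' from `nvcc --version` output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return "{}.{}".format(match.group(1), match.group(2))


def major_minor(version):
    """Turn '9.0' or '9.0.176' into the tuple (9, 0)."""
    parts = version.split(".")
    minor = int(parts[1]) if len(parts) > 1 else 0
    return int(parts[0]), minor


def versions_match(torch_cuda, system_cuda):
    """True only when both versions are known and agree on major.minor."""
    if torch_cuda is None or system_cuda is None:
        return False
    return major_minor(torch_cuda) == major_minor(system_cuda)


def system_cuda_version():
    """Ask the nvcc found on PATH for its release; None if nvcc is missing."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """CUDA version PyTorch was built with; None if CPU-only or not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    torch_cuda = torch_cuda_version()
    system_cuda = system_cuda_version()
    print("PyTorch CUDA:", torch_cuda)
    print("System CUDA (nvcc):", system_cuda)
    print("Match:", versions_match(torch_cuda, system_cuda))
```

With the values reported above (9.0 from PyTorch, 9.2 from nvcc) the script would report a mismatch, which is the situation that produced the undefined symbol error.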

It's a CUDA version problem. Using CUDA 9 with PyTorch 0.4.1 will fix it.

Jun 14 '19 18:06 guanxiongsun

I encountered the same problem when using CUDA 10.1 and PyTorch 0.4.1. Yes, it's a CUDA version problem. I switched from CUDA 10.1 to CUDA 8.0, which solved it.

Jun 27 '19 08:06 hktxt

Go to https://pytorch.org/ to get the right PyTorch build for your CUDA version.

For example, for CUDA 10:

$ conda install pytorch torchvision cudatoolkit==10.0 -c pytorch

That will fix the problem.

Aug 03 '19 15:08 clemente0731

I still can't fix the problem with CUDA 9 and PyTorch 0.4.1, which is strange. There is no cuDNN on my server machine; could that be the problem?

Oct 01 '19 14:10 Stephenfang51

https://github.com/CharlesShang/DCNv2 Using the PyTorch 1.0 version of the deformable convnets worked for me. @xingyizhou, could you verify this?

Oct 09 '19 08:10 earlfernando

CUDA 10.0, PyTorch 0.4.1: has anyone solved this?

Oct 24 '19 14:10 zjp99

Did you solve it?

Oct 25 '19 00:10 zjp99

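My environment was CUDA 10.0, Python 3.6, and PyTorch 0.4, and I hit this problem.

So I cloned the environment and tried some fixes. First I switched to PyTorch 1.1 and torchvision 0.3, and I also replaced DCNv2. That finally solved the problem.

For details, see sections 4.2 and 6 of https://blog.csdn.net/weixin_38705903/article/details/102598339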

Oct 25 '19 01:10 zjp99

I'm using CUDA 10.1. I solved the problem by installing PyTorch 1.2.0, replacing the DCNv2 in this repo with the original repo, and compiling it again.

Now it works perfectly.

Nov 20 '19 12:11 kwea123

For CUDA 10.1 + PyTorch 1, use the tag pytorch_1.0.

For CUDA 9.0 + PyTorch 0.4, use the tag pytorch_0.4.

Dec 11 '19 20:12 bazinga012
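
The tag mapping given by @bazinga012 can be written down as a small lookup. This is a sketch only: dcnv2_tag_for is a hypothetical helper, it knows just the two tags named above, and it returns None for any other version rather than guessing.

```python
def dcnv2_tag_for(torch_version):
    """Map a PyTorch version string to one of the two tags named in this thread.

    Only the pairings mentioned above are covered (assumption: any 1.x release
    counts as "pytorch1"); anything else yields None.
    """
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    if major == 1:
        return "pytorch_1.0"
    if (major, minor) == (0, 4):
        return "pytorch_0.4"
    return None


if __name__ == "__main__":
    for version in ("0.4.1", "1.0.1", "0.3.1"):
        print(version, "->", dcnv2_tag_for(version))
```

In practice the argument would be `torch.__version__` from the environment in which DCNv2 is going to be compiled.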

As @uniquezhengjie said, this error is raised when the PyTorch CUDA version and the system CUDA version do not match. The deformable convolution (DCN) in this repository requires CUDA version <= 10.0, so this repository uses PyTorch 0.4 (which only supports CUDA version <= 10.0); the error is raised when your system CUDA version is >= 10.0.

If you do not want to downgrade your system CUDA version, it seems that you need to adopt another DCN that supports CUDA >= 10.0.

Install PyTorch 1.0:

$ conda install pytorch=1.0 torchvision -c pytorch

Change your DCN, following @zjp99:

$ cd ~/Code/CenterNet/src/lib/models/networks
$ rm -r DCNv2
$ git clone https://github.com/CharlesShang/DCNv2.git
$ cd DCNv2
$ sh make.sh
$ python test.py

Then run the demo:

$ python demo.py ctdet --demo /path/to/image/or/folder/or/video --load_model ../models/ctdet_coco_dla_2x.pth

In my experiments, this solved the problem.

Feb 17 '20 03:02 deepalchemist
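
The DCNv2 replacement steps from @deepalchemist can be wrapped in a script so that they are previewed before anything is deleted. The sketch below is an illustration, not part of CenterNet: rebuild_plan and run_plan are hypothetical names, the networks directory is the one quoted above, and the default is a dry run that only prints the commands.

```python
import os
import subprocess

# Repository URL as given in the thread.
DCNV2_URL = "https://github.com/CharlesShang/DCNv2.git"


def rebuild_plan(networks_dir):
    """Return (working_directory, argv) pairs mirroring the manual steps."""
    dcn_dir = os.path.join(networks_dir, "DCNv2")
    return [
        (networks_dir, ["rm", "-r", "DCNv2"]),
        (networks_dir, ["git", "clone", DCNV2_URL]),
        (dcn_dir, ["sh", "make.sh"]),
        (dcn_dir, ["python", "test.py"]),
    ]


def run_plan(plan, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, argv in plan:
        print("[{}] $ {}".format(cwd, " ".join(argv)))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    networks = os.path.expanduser("~/Code/CenterNet/src/lib/models/networks")
    run_plan(rebuild_plan(networks))  # dry run: prints the steps only
```

Passing dry_run=False would actually remove the old DCNv2 directory, so it is worth reading the printed plan first.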

@kwea123 I'm doing exactly what you suggest: CUDA 10.1, PyTorch 1.2.0, with the replaced DCNv2. However, when I execute:

$ python demo.py ctdet --demo ../images/ --load_model ../models/ctdet_coco_dla_2x.pth

I get the following error:

AssertionError: Torch not compiled with CUDA enabled

Am I missing something?

Here is the full stack trace:

Fix size testing.
training chunk_sizes: [32]
The output will be saved to /home/shihkuan/workFiles/centernet/PythonAPI/CenterNet/src/lib/../../exp/ctdet/default
heads {'hm': 80, 'wh': 2, 'reg': 2}
Creating model...
loaded ../models/ctdet_coco_dla_2x.pth, epoch 230
Traceback (most recent call last):
  File "demo.py", line 56, in <module>
    demo(opt)
  File "demo.py", line 21, in demo
    detector = Detector(opt)
  File "/home/shihkuan/workFiles/centernet/PythonAPI/CenterNet/src/lib/detectors/ctdet.py", line 26, in __init__
    super(CtdetDetector, self).__init__(opt)
  File "/home/shihkuan/workFiles/centernet/PythonAPI/CenterNet/src/lib/detectors/base_detector.py", line 26, in __init__
    self.model = self.model.to(opt.device)
  File "/home/shihkuan/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 432, in to
    return self._apply(convert)
  File "/home/shihkuan/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 208, in _apply
    module._apply(fn)
  File "/home/shihkuan/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 208, in _apply
    module._apply(fn)
  File "/home/shihkuan/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 208, in _apply
    module._apply(fn)
  File "/home/shihkuan/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 230, in _apply
    param_applied = fn(param)
  File "/home/shihkuan/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 430, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
  File "/home/shihkuan/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/cuda/__init__.py", line 178, in _lazy_init
    _check_driver()
  File "/home/shihkuan/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/cuda/__init__.py", line 92, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Jun 19 '20 07:06 alexrider1105
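
The "Torch not compiled with CUDA enabled" assertion in the trace above is what a CPU-only PyTorch build raises when a model is moved to a CUDA device. A quick way to confirm which kind of build is installed is sketched below; diagnose_build is a hypothetical helper, while `torch.version.cuda` and `torch.cuda.is_available()` are the real PyTorch attributes it reads.

```python
def diagnose_build(cuda_version, cuda_available):
    """Summarise a PyTorch install from torch.version.cuda and is_available()."""
    if cuda_version is None:
        # torch.version.cuda is None on CPU-only builds.
        return "CPU-only PyTorch build: install a CUDA-enabled package instead"
    if not cuda_available:
        return "CUDA build ({}), but no usable GPU or driver was found".format(cuda_version)
    return "CUDA build ({}) with a usable GPU".format(cuda_version)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print(torch.__version__)
        print(diagnose_build(torch.version.cuda, torch.cuda.is_available()))
```

If the first message is printed, reinstalling PyTorch from a CUDA-enabled package (for example with an explicit cudatoolkit version, as suggested earlier in the thread) is the likely next step before rebuilding DCNv2.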

@deepalchemist When I use PyTorch 1.0 with CUDA 10.1, as you suggested, and run:

$ python demo.py ctdet --demo ../images/ --load_model ../models/ctdet_coco_dla_2x.pth

I get the following error:

ImportError: /home/shihkuan/workFiles/centernet/PythonAPI/CenterNet/src/lib/models/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E

Do you happen to know what is causing this?

Jun 19 '20 07:06 alexrider1105

Do I have to downgrade my system CUDA version? I created a new conda environment and installed PyTorch 0.4.1 and CUDA 9.0 in it, but when I run demo.py the error still occurs. So I wonder how the system CUDA relates to the CUDA in conda. Can you give me any suggestions?

Feb 20 '23 08:02 lihuining
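
On the question of system CUDA versus conda CUDA: as far as I understand, the conda PyTorch package brings its own CUDA runtime libraries, while compiling DCNv2 uses whichever nvcc the shell finds, which is often the system toolkit. Installing "cuda 9.0" inside conda therefore may not change the compiler used for the extension. The sketch below only reports where nvcc resolves from; describe_toolchain is a hypothetical helper, and CUDA_HOME and CONDA_PREFIX are the usual environment variables, assumed to be set by your shell if at all.

```python
import os
import shutil


def describe_toolchain(environ, nvcc_path):
    """Describe which CUDA toolkit a build started from this shell would see."""
    lines = [
        "CUDA_HOME=" + environ.get("CUDA_HOME", "<unset>"),
        "nvcc on PATH: " + (nvcc_path or "<not found>"),
    ]
    conda_prefix = environ.get("CONDA_PREFIX")
    if nvcc_path and conda_prefix and nvcc_path.startswith(conda_prefix):
        lines.append("nvcc comes from the active conda environment")
    elif nvcc_path:
        lines.append("nvcc comes from outside the conda environment (system toolkit)")
    return lines


if __name__ == "__main__":
    for line in describe_toolchain(os.environ, shutil.which("nvcc")):
        print(line)
```

If nvcc turns out to be the system toolkit, its version is the one that has to match `torch.version.cuda`, regardless of what was installed through conda.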

@lihuining Have you solved this problem? I have run into the same issue.

Jun 07 '23 23:06 CedrusLNZ

@DeepAlchemist Does this project have two branches? When I use PyTorch 0.4.1 and cuda90, compilation fails:

running build_ext
building '_ext' extension
g++ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src -I/home1/linazhan/.conda/envs/CenterTrack3/lib/python3.6/site-packages/torch/lib/include -I/home1/linazhan/.conda/envs/CenterTrack3/lib/python3.6/site-packages/torch/lib/include/TH -I/home1/linazhan/.conda/envs/CenterTrack3/lib/python3.6/site-packages/torch/lib/include/THC -I/spack/apps/linux-centos7-x86_64/gcc-4.9.4/cuda-9.2.88-nak6j4dtwls6r42eaqmpx5krncqhwrnh/include -I/home1/linazhan/.conda/envs/CenterTrack3/include/python3.6m -c /home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/vision.cpp -o build/temp.linux-x86_64-3.6/home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/vision.o -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
In file included from /home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/dcn_v2.h:3:0,
                 from /home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/vision.cpp:2:
/home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/cpu/vision.h:2:29: fatal error: torch/extension.h: No such file or directory
 #include <torch/extension.h>

I searched Google, and the results say the PyTorch version is too low.

Jun 08 '23 00:06 CedrusLNZ

@guanxiongsun When I use CUDA 9 and PyTorch 0.4.1 to compile DCNv2, it reports:

/home1/linazhan/CenterTrack/src/lib/model/networks/DCNv2/src/cpu/vision.h:2:29: fatal error: torch/extension.h: No such file or directory
 #include <torch/extension.h>

Jun 08 '23 00:06 CedrusLNZ
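
Several distinct error messages appear in this thread, and they point to different causes. The lookup below collects them in one place as a rough triage aid. It is a sketch: hint_for and the HINTS table are hypothetical, the first, third, and fourth hints paraphrase what posters reported above, and the hint for the caffe2 symbol is my own assumption, since nobody in the thread answered that question.

```python
# (substring to look for, hint) pairs; the first match wins.
HINTS = [
    (
        "undefined symbol: __cuda",
        "PyTorch CUDA version and system CUDA version probably differ; "
        "compare torch.version.cuda with nvcc --version",
    ),
    (
        "undefined symbol: _ZN6caffe2",
        "assumption: the extension was built against a different PyTorch "
        "than the one being imported; rebuild DCNv2 in the active environment",
    ),
    (
        "Torch not compiled with CUDA enabled",
        "the installed PyTorch is a CPU-only build; install a CUDA-enabled one",
    ),
    (
        "torch/extension.h: No such file",
        "the PyTorch version is reportedly too old for this DCNv2 source; "
        "use a newer PyTorch or the DCNv2 version matching PyTorch 0.4",
    ),
]


def hint_for(message):
    """Return the first hint whose substring occurs in the message, else None."""
    for needle, hint in HINTS:
        if needle in message:
            return hint
    return None


if __name__ == "__main__":
    sample = "_dcn_v2.so: undefined symbol: __cudaPopCallConfiguration"
    print(hint_for(sample))
```

Pasting the last line of a traceback or compiler log into hint_for gives a starting point only; the version checks discussed earlier in the thread remain the way to confirm the cause.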
