DCNv2

PyTorch 1.6-1.8 compatibility - CUDA11/3090 ready

Open MatthewHowe opened this issue 4 years ago • 59 comments

Modified from the pull request by @half-potato for compatibility with torch 1.6. Replaced THBlas functions with ATen tensor functions. Tested with torch 1.7 and 1.8 on CUDA 10 and 11. Worked with RTX2080 and RTX3090. @xdynames

MatthewHowe avatar Nov 13 '20 13:11 MatthewHowe

#90 #89 #88 #74

MatthewHowe avatar Nov 13 '20 13:11 MatthewHowe

Hi Matthew! I have an RTX3090 and cloned your project as you modified it 14 hours ago, but ./make.sh still fails with: nvcc fatal : Unsupported gpu architecture 'compute_86' [image]

I am on Ubuntu 18.04 with CUDA 11.1, PyTorch 1.7, and gcc 7.5.0 / g++ 7.5.0.

I guess the error is probably caused by CUDA. ANY HELP WOULD BE APPRECIATED!!

jerryhitit avatar Nov 14 '20 03:11 jerryhitit
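One common cause of "Unsupported gpu architecture 'compute_86'" is that the nvcc actually invoked by the build is older than CUDA 11.1, the first toolkit release that accepts that architecture, even when a newer toolkit is installed elsewhere on the machine. The following is a minimal sketch of that check, not part of DCNv2: the helper names are hypothetical, the table lists only a few architectures, and the input is assumed to be the text printed by nvcc --version.

```python
import re

# First CUDA toolkit release whose nvcc accepts each architecture.
# Only a handful of entries are listed here for illustration.
MIN_CUDA_FOR_ARCH = {
    "compute_70": (9, 0),
    "compute_75": (10, 0),
    "compute_80": (11, 0),
    "compute_86": (11, 1),
}


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from text such as 'release 11.1, V11.1.105'."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def nvcc_supports(arch, nvcc_output):
    """Return True if the reported nvcc release is new enough for arch."""
    return parse_nvcc_release(nvcc_output) >= MIN_CUDA_FOR_ARCH[arch]


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 11.0, V11.0.221"
    print(nvcc_supports("compute_86", sample))
```

If the check fails for the nvcc found on PATH, the environment variables that select the toolkit are the first thing to inspect.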

Jerry, did you install a nightly binary for your PyTorch? https://discuss.pytorch.org/t/rtx-3000-support/98158

I have built this in a Docker container using Nvidia's CUDA 11.1 base image, then used the pip command in the link to install PyTorch compiled with RTX3000 support, and it seems to work well. (@MatthewHowe, what base image did you use?)

From some googling, it looks like it could also be conflicting versions of different Nvidia packages: nvcc, cuDNN, etc.

XDynames avatar Nov 14 '20 03:11 XDynames

Thanks, @XDynames. I previously installed PyTorch with: pip install torch==1.7.0+cu110 torchvision==0.8.1+cu110 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

Following your suggestion to use a nightly binary, I ran: pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html

My PyTorch version now looks like this: [image]

Now I get a ninja-related error like this: [image], which results in RuntimeError: Error compiling objects for extension

I will double check all the NVIDIA packages, and find a way to solve the ninja problem. Thanks AGAIN!

jerryhitit avatar Nov 14 '20 03:11 jerryhitit

I used this Docker image: docker pull nvidia/cuda:11.1-devel-ubuntu18.04. I installed conda, then torch-nightly, and then cloned and compiled DCNv2. This could be an issue with CUDA 11.0 or some other conflicting packages. When DCN doesn't compile, the error that actually caused the failure is usually above your screen cap. If you run ./make again, the parts that already compiled will not be rebuilt, which makes it clearer what is causing the issue.

MatthewHowe avatar Nov 14 '20 07:11 MatthewHowe
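The advice above, that the real cause sits higher up in the output than the final RuntimeError, can be applied mechanically to a saved build log. This is a minimal sketch with a hypothetical helper name; it assumes the log was captured to text (for example by redirecting the output of ./make.sh to a file) and that compiler diagnostics contain "error:" or "nvcc fatal", as they do in the logs quoted in this thread.

```python
import re

# Compiler diagnostics look like "file.cu(126): error: ..." or "nvcc fatal : ...".
# The word boundary keeps Python exception names such as "RuntimeError:" out.
ERROR_RE = re.compile(r"\berror\b\s*:|nvcc fatal", re.IGNORECASE)


def first_errors(log_text, limit=5):
    """Return up to `limit` diagnostic lines, in the order they appear."""
    hits = []
    for line in log_text.splitlines():
        if ERROR_RE.search(line):
            hits.append(line.strip())
            if len(hits) == limit:
                break
    return hits


if __name__ == "__main__":
    sample_log = "\n".join([
        "FAILED: dcn_v2_cuda.o",
        'dcn_v2_cuda.cu(126): error: identifier "THCudaBlas_SgemmBatched" is undefined',
        "ninja: build stopped: subcommand failed.",
        "RuntimeError: Error compiling objects for extension",
    ])
    for line in first_errors(sample_log):
        print(line)
```

The first line it reports is usually the one worth searching for; the trailing Python traceback only says that the build failed.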

Hi @MatthewHowe, thanks for your great advice!

I double-checked my CUDA installation and nvcc settings. After setting those environment variables properly, I no longer get the corresponding errors such as ['nvcc', '-v'].

However, ninja still reports a failure involving 'THCudaBlas_SgemmBatched'. It seems to be a new problem.

The log is like this:

FAILED: /home/liurui/DCNv2/build/temp.linux-x86_64-3.7/home/liurui/DCNv2/DCN/src/cuda/dcn_v2_cuda.o /usr/local/cuda-11.1/bin/nvcc -DWITH_CUDA -I/home/liurui/DCNv2/DCN/src -I/home/liurui/anaconda3/envs/FairMOT/lib/python3.7/site-packages/torch/include -I/home/liurui/anaconda3/envs/FairMOT/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/liurui/anaconda3/envs/FairMOT/lib/python3.7/site-packages/torch/include/TH -I/home/liurui/anaconda3/envs/FairMOT/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda-11.1/include -I/home/liurui/anaconda3/envs/FairMOT/include/python3.7m -c -c /home/liurui/DCNv2/DCN/src/cuda/dcn_v2_cuda.cu -o /home/liurui/DCNv2/build/temp.linux-x86_64-3.7/home/liurui/DCNv2/DCN/src/cuda/dcn_v2_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=sm_80 -ccbin g++ -std=c++14 /home/liurui/DCNv2/DCN/src/cuda/dcn_v2_cuda.cu(126): error: identifier "THCudaBlas_SgemmBatched" is undefined

Sorry, I FIXED this problem by downgrading my PyTorch 1.8 nightly binary to the 1.7 stable version. THCudaBlas_SgemmBatched was modified in recent versions, which caused this problem.

It works well and compiles successfully.

AND thanks again for Matthew's great work!!

jerryhitit avatar Nov 15 '20 06:11 jerryhitit
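The workaround reported above amounts to a version gate: the build that uses THCudaBlas_SgemmBatched compiled against PyTorch 1.7 stable but not against the 1.8 nightly. Below is a minimal sketch of such a gate, with hypothetical function names. The (1, 8) cutoff is an assumption taken only from the reports in this thread, not from PyTorch release notes, and the version strings are the kind found in torch.__version__.

```python
import re


def parse_version(version):
    """Return (major, minor) from strings like '1.7.0+cu110' or '1.8.0.dev20201113+cu110'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def thc_sgemm_batched_expected(version):
    """Assumption from this thread: builds before 1.8 still declare the symbol."""
    return parse_version(version) < (1, 8)


if __name__ == "__main__":
    for candidate in ("1.7.0+cu110", "1.8.0.dev20201113+cu110"):
        print(candidate, thc_sgemm_batched_expected(candidate))
```

A setup script could run a check like this before compiling and stop with a readable message instead of a long nvcc log.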

Just looked into this, and ATen lost this definition on 13 Nov...

Maybe we should look into replacing SgemmBatched with a non-deprecated version for 1.8 support? https://github.com/pytorch/pytorch/issues/47987

XDynames avatar Nov 15 '20 09:11 XDynames

pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html

the same error

Shank2358 avatar Nov 16 '20 09:11 Shank2358

pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html

the same error

You can try downgrading PyTorch to 1.7 stable; it works fine for me.

jerryhitit avatar Nov 16 '20 10:11 jerryhitit

pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html

the same error

You can try downgrading PyTorch to 1.7 stable; it works fine for me.

Thank you. I will try it again.

Shank2358 avatar Nov 16 '20 11:11 Shank2358

I have compiled successfully using pytorch1.7. Thanks. @jerryhitit @MatthewHowe

Shank2358 avatar Nov 17 '20 02:11 Shank2358

I successfully compiled on Windows 10, CUDA 11.1 (RTX3090), and PyTorch 1.7. Thank you so much!

duanzhiihao avatar Nov 30 '20 13:11 duanzhiihao

@MatthewHowe Hi Matthew, I failed to compile with PyTorch 1.7; the build ends with RuntimeError: Error compiling objects for extension.

I used your latest version, which supports PyTorch 1.7. My environment (using an Anaconda virtual env): [image]

gcc 7.5.0 ninja 1.10.2 ubuntu 18.04 python 3.7 pytorch 1.7 cudatoolkit 10.2

torch.cuda.is_available() returns True and CUDA_HOME is not None.

Error Message: ninja: build stopped: subcommand failed. Traceback (most recent call last): File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1522, in _run_ninja_build env=env) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/subprocess.py", line 481, in run output=stdout, stderr=stderr) subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "setup.py", line 69, in cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension}, File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/init.py", line 153, in setup return distutils.core.setup(**attrs) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/core.py", line 148, in setup dist.run_commands() File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 966, in run_commands self.run_command(cmd) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 985, in run_command cmd_obj.run() File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build.py", line 135, in run self.run_command(cmd_name) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/cmd.py", line 313, in run_command self.distribution.run_command(command) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 985, in run_command cmd_obj.run() File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run _build_ext.run(self) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 339, in run self.build_extensions() File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 653, in build_extensions build_ext.build_extensions(self) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 448, in build_extensions self._build_extensions_serial() File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 473, in _build_extensions_serial self.build_extension(ext) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 196, in 
build_extension _build_ext.build_extension(self, ext) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 533, in build_extension depends=ext.depends) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 482, in unix_wrap_ninja_compile with_cuda=with_cuda) File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1238, in _write_ninja_file_and_compile_objects error_prefix='Error compiling objects for extension') File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1538, in _run_ninja_build raise RuntimeError(message) from e RuntimeError: Error compiling objects for extension

Could you help me?

KiedaTamashi avatar Dec 02 '20 14:12 KiedaTamashi

Double check that your versions all line up - if you want to use CUDA 10.2 make sure CUDNN is the correct version and the pytorch binary you are using is compiled with CUDA 10.2

XDynames avatar Dec 04 '20 05:12 XDynames

Double check that your versions all line up - if you want to use CUDA 10.2 make sure CUDNN is the correct version and the pytorch binary you are using is compiled with CUDA 10.2

Hi @XDynames, I solved this by modifying a file in my Python environment: "anaconda3/envs/py37/lib/python3.7/site-packages/torch/utils/cpp_extension.py"

But then I ran into another gcc compile problem.

running install running bdist_egg running egg_info writing DCNv2.egg-info/PKG-INFO writing dependency_links to DCNv2.egg-info/dependency_links.txt writing top-level names to DCNv2.egg-info/top_level.txt reading manifest file 'DCNv2.egg-info/SOURCES.txt' writing manifest file 'DCNv2.egg-info/SOURCES.txt' installing library code to build/bdist.linux-x86_64/egg running install_lib running build_py running build_ext building '_ext' extension Emitting ninja build file /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/build.ninja... Compiling objects... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) 1.10.2 g++ -pthread -shared -B /NAS/home01/tanzhenwei/anaconda3/envs/py37/compiler_compat -L/NAS/home01/tanzhenwei/anaconda3/envs/py37/lib -Wl,-rpath=/NAS/home01/tanzhenwei/anaconda3/envs/py37/lib -Wl,--no-as-needed -Wl,--sysroot=/ /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/vision.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cpu/dcn_v2_cpu.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cpu/dcn_v2_im2col_cpu.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cpu/dcn_v2_psroi_pooling_cpu.o 
/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_cuda.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_psroi_pooling_cuda.o -L/NAS/home01/tanzhenwei/anaconda3/envs/py37/lib/python3.7/site-packages/torch/lib -L/usr/local/cuda/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-3.7/_ext.cpython-37m-x86_64-linux-gnu.so g++: error: /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_cuda.o: No such file or directory g++: error: /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.o: No such file or directory g++: error: /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_psroi_pooling_cuda.o: No such file or directory error: command 'g++' failed with exit status 1

Do you have any advice? The environment is the same.

KiedaTamashi avatar Dec 07 '20 10:12 KiedaTamashi
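In the log above, g++ is only the messenger: the link step lists three CUDA object files that do not exist, which suggests the nvcc compile step failed or was skipped earlier in the build. A minimal sketch for confirming which inputs are missing from a link command follows; the function name is hypothetical, and it assumes a POSIX-style command line copied from the build output and run from the same working directory as the build.

```python
import os
import shlex


def missing_objects(link_command):
    """Return the .o arguments of a link command that are not on disk."""
    return [
        token
        for token in shlex.split(link_command)
        if token.endswith(".o") and not os.path.exists(token)
    ]


if __name__ == "__main__":
    # Illustrative command; the paths are placeholders, not taken from a real build.
    command = "g++ -shared build/vision.o build/dcn_v2_cuda.o -o build/_ext.so"
    for path in missing_objects(command):
        print("not built:", path)
```

If every missing file is a CUDA object, the diagnostic to look for is in the nvcc output further up, not in the g++ line.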

@MatthewHowe Hi MatthewHowe. Thanks for your great work; I successfully compiled on Ubuntu 18.04.5, CUDA 11.1 (RTX3090), and PyTorch 1.7.0. There are still some other packages that I need to compile manually. I wonder whether there are guidelines, principles, or rules for porting source code from CUDA 10 (or even earlier versions) to CUDA 11 so that I can compile it in my current environment. Although I browsed the changed files, I still have no idea how to do it properly. Would you mind providing some guidance? Looking forward to your reply.

ConnerWK avatar Dec 17 '20 16:12 ConnerWK

@ConnerWK Not to put too fine a point on it, but the code for DCN has become a bit messy. What we have done is replace low-level BLAS & CUDABLAS function calls with higher-level ATen equivalents.

We view this as a band-aid, so we've started working on a pure PyTorch nn.Module-based solution that will not require compiling. Currently we have deformable convolution v1/v2 passing all the unit tests from this code, but have yet to break ground on ROI pooling.

Let me know if this is something you'd be interested in

XDynames avatar Dec 17 '20 22:12 XDynames
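For readers wondering what "replacing a BLAS call" involves: a batched GEMM routine such as the THCudaBlas_SgemmBatched named earlier in this thread follows the standard GEMM contract, C = alpha * A @ B + beta * C, applied to each matrix in a batch. The plain-Python reference below is only a sketch of that contract (nested lists, no transpose flags, no column-major layout), and the function name is hypothetical. In PyTorch the same arithmetic is available as torch.baddbmm(C, A, B, beta=beta, alpha=alpha); whether the patch discussed here used that exact call is not stated in the thread.

```python
def gemm_batched(alpha, a_batch, b_batch, beta, c_batch):
    """Return alpha * A @ B + beta * C for every (A, B, C) in the batch."""
    results = []
    for a, b, c in zip(a_batch, b_batch, c_batch):
        rows, inner, cols = len(a), len(b), len(b[0])
        results.append([
            [
                alpha * sum(a[i][k] * b[k][j] for k in range(inner)) + beta * c[i][j]
                for j in range(cols)
            ]
            for i in range(rows)
        ])
    return results


if __name__ == "__main__":
    a = [[[1.0, 2.0], [3.0, 4.0]]]
    b = [[[5.0, 6.0], [7.0, 8.0]]]
    c = [[[1.0, 1.0], [1.0, 1.0]]]
    print(gemm_batched(1.0, a, b, 2.0, c))
```

A reference like this is handy for unit-testing a replacement: feed the same small matrices to both and compare the results.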

@XDynames Thanks for your quick reply. Your project sounds great! I look forward to seeing your new achievements soon.

To be honest, I'm a novice in this field with little knowledge and experience. Recently I ran into trouble compiling the two packages listed below (extreme_utils and the ROI align layer, which are utility parts of a modified CenterNet).

https://github.com/zju3dv/snake/tree/master/lib/csrc/extreme_utils https://github.com/zju3dv/snake/tree/master/lib/csrc/roi_align_layer

They work well on my old platform (Ubuntu 18.04, RTX2080Ti, PyTorch 1.2.0, CUDA 10.0, nvidia-410 driver, if my memory serves me right). However, I can't compile them properly in the RTX3090 environment (Ubuntu 18.04.5, PyTorch 1.7.0, CUDA 11.1, nvidia-455 driver).

Luckily, I found your work here and compiled DCN successfully. Because the error messages are limited, I asked whether there are rules or a correspondence to follow when modifying the source code. It seems I have a lot more to learn.

Thanks for your reply again! I look forward to seeing your new achievements soon.

ConnerWK avatar Dec 19 '20 18:12 ConnerWK

Can you help solve this problem? I compiled successfully with CUDA 11 and PyTorch 1.7 (RTX 3090), but I get the error below when running. Thank you very much, @MatthewHowe.

error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1607370156314/work/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=700 : an illegal memory access was encountered terminate called after throwing an instance of 'std::runtime_error' what(): NCCL error in: /opt/conda/conda-bld/pytorch_1607370156314/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8 error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device /opt/conda/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 33 leaked semaphores to clean up at shutdown len(cache)) Traceback (most recent call last): File "C_ddp.py", line 349, in main() File "C_ddp.py", line 109, in main mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args)) File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn return start_processes(fn, args, nprocs, join, daemon, start_method='spawn') File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes while not context.join(): File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join raise Exception(msg) Exception:

-- Process 0 terminated with the following error: Traceback (most recent call last): File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap fn(i, *args) File "/home/wj/Detection/CenterNetV2/C_ddp.py", line 277, in main_worker center_loss, center_fuse_loss, scale_loss, offset_loss = model({'img':img , 'label':label , 'heatmap_t':heatmap_t , 'hm_FuseClass_t':hm_FuseClass_t}) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward output = self.module(*inputs[0], **kwargs[0]) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/home/wj/Detection/CenterNetV2/nets/resnet_dcn_model.py", line 35, in forward out=self.backbone(x)[0] File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/home/wj/Detection/CenterNetV2/networks/resnet_dcn.py", line 261, in forward x = self.deconv_layers(x) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward input = module(input) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 929, in forward output_padding, self.groups, self.dilation) RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.allow_tf32 = True
data = torch.randn([32, 256, 40, 40], dtype=torch.float, device='cuda', requires_grad=True)
net = torch.nn.Conv2d(256, 256, kernel_size=[4, 4], padding=[1, 1], stride=[2, 2], dilation=[1, 1], groups=1)
net = net.cuda().float()
out = net(data)
out.backward(torch.randn_like(out))
torch.cuda.synchronize()

ConvolutionParams data_type = CUDNN_DATA_FLOAT padding = [1, 1, 0] stride = [2, 2, 0] dilation = [1, 1, 0] groups = 1 deterministic = true allow_tf32 = true input: TensorDescriptor 0x55923cf1b4e0 type = CUDNN_DATA_FLOAT nbDims = 4 dimA = 32, 256, 40, 40, strideA = 409600, 1600, 40, 1, output: TensorDescriptor 0x55923cf1d1b0 type = CUDNN_DATA_FLOAT nbDims = 4 dimA = 32, 256, 20, 20, strideA = 102400, 400, 20, 1, weight: FilterDescriptor 0x55923cf4cec0 type = CUDNN_DATA_FLOAT tensor_format = CUDNN_TENSOR_NCHW nbDims = 4 dimA = 256, 256, 4, 4, Pointer addresses: input: 0x7fcf70a80000 output: 0x7fcf6fe00000 weight: 0x7fd1d1700000

WangJian981002 avatar Jan 06 '21 18:01 WangJian981002
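"no kernel image is available for execution on the device" generally means the compiled extension contains no code for the compute capability of the GPU it is running on. PyTorch's extension builder reads the TORCH_CUDA_ARCH_LIST environment variable to decide which architectures to compile for. The sketch below only formats a value for that variable; the function name is hypothetical, the capability tuples are assumed to come from something like torch.cuda.get_device_capability(), and it does not check whether a given PyTorch release accepts a given entry such as 8.6.

```python
def arch_list(capabilities, ptx_for_newest=True):
    """Format (major, minor) pairs as a TORCH_CUDA_ARCH_LIST value."""
    entries = ["%d.%d" % cap for cap in sorted(set(capabilities))]
    if ptx_for_newest and entries:
        # "+PTX" asks for forward-compatible PTX alongside the newest binary.
        entries[-1] += "+PTX"
    return ";".join(entries)


if __name__ == "__main__":
    # Example: a machine with both a Turing (7.5) and an Ampere (8.6) card.
    print(arch_list([(8, 6), (7, 5)]))
```

The resulting string would be exported as TORCH_CUDA_ARCH_LIST before rerunning the build, after removing the old build directory so that stale objects are not reused.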

@KiedaTamashi Did you solve the gcc compile problem you posted above (g++: error: ... dcn_v2_cuda.o: No such file or directory)? I have run into the same issue.

WangJian981002 avatar Jan 07 '21 17:01 WangJian981002

PyTorch version: 1.7 stable; gcc 8.3.1 20191121 (Red Hat 8.3.1-5); CUDA version: 11.0. I can run PyTorch in other projects, so the PyTorch and CUDA versions should match.

make return error as follow: Emitting ninja build file /home/opt/mot/DCNv2/build/temp.linux-x86_64-3.8/build.ninja... Compiling objects... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) FAILED: /home/opt/mot/DCNv2/build/temp.linux-x86_64-3.8/home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.o /usr/local/cuda/bin/nvcc -DWITH_CUDA -I/home/opt/mot/DCNv2/src -I/root/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/include -I/root/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/root/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/include/TH -I/root/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/root/anaconda3/envs/FairMOT/include/python3.8 -c -c /home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.cu -o /home/opt/mot/DCNv2/build/temp.linux-x86_64-3.8/home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin g++ -std=c++14 /home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.cu(107): error: identifier "THCState_getCurrentStream" is undefined /home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.cu(279): error: identifier "THCState_getCurrentStream" is undefined /home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.cu(324): error: identifier "THCudaBlas_Sgemv" is undefined 3 errors detected in the compilation of "/tmp/tmpxft_0011fd41_00000000-6_dcn_v2_cuda.cpp1.ii".

sparkfax avatar Jan 21 '21 06:01 sparkfax

I switched to this version, https://github.com/lbin/DCNv2, and the 'identifier "THCState_getCurrentStream" is undefined' error is solved.

sparkfax avatar Jan 21 '21 08:01 sparkfax

Is there a solution for compiling this branch with PyTorch 1.8 and CUDA 11.1 (as reported by torch.version.cuda)?

hhcs9527 avatar Feb 09 '21 12:02 hhcs9527

@hhcs9527 Not yet. We have a version of deformable convolution (not ROI pooling) that does work with those versions, but it currently does not work well in multi-GPU training (it is very slow). You might be able to patch what is here again by working out suitable ATen functions to replace the deprecated BLAS calls. We felt like we'd be doing this forever after literally having some of the functions we used as replacements deprecated in the next version of PyTorch (which dropped a day after we submitted this pull request).

XDynames avatar Feb 09 '21 21:02 XDynames

@XDynames Thanks for your insight into this repository. Since the current version of PyTorch is unfriendly to RTX3090, I thought it might get great performance after updating to PyTorch1.8, but I can use PyTorch1.7.1 instead. Thanks for your great work.

hhcs9527 avatar Feb 10 '21 03:02 hhcs9527

@MatthewHowe Hi Matthew, I failed to compile using pytorch1.7 with RuntimeError: Error compiling objects for extension.

I used your latest version, which supports pytorch 1.7. My environment (using an anaconda virtual env):

gcc 7.5.0, ninja 1.10.2, ubuntu 18.04, python 3.7, pytorch 1.7, cudatoolkit 10.2

torch.cuda.is_available() returns True and CUDA_HOME is not None.

Error Message:

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1522, in _run_ninja_build
    env=env)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/subprocess.py", line 481, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "setup.py", line 69, in <module>
    cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 653, in build_extensions
    build_ext.build_extensions(self)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 482, in unix_wrap_ninja_compile
    with_cuda=with_cuda)
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1238, in _write_ninja_file_and_compile_objects
    error_prefix='Error compiling objects for extension')
  File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1538, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

Could you help me?

May I ask how you solved this problem?

lyaling8230 avatar Mar 05 '21 14:03 lyaling8230

In this case I think the user's cuda toolkit package is version 10.2 and their cuda version is 11.1, causing a conflict.

XDynames avatar Mar 05 '21 22:03 XDynames

"In this case I think the user's cuda toolkit package is version 10.2 and their cuda version is 11.1, causing a conflict."

Hi, I checked the versions of my cuda toolkit packages and my cuda version. They are all 11.0, and I have the same problem.

flow-specter avatar Mar 08 '21 09:03 flow-specter

I hit the same error and it is driving me crazy! @WangJian981002 @XiaoSanGit @XDynames Have you solved the problem? My versions: gcc 7.5.0, g++ 7.5.0, ubuntu 18.04.4, python 3.8.5, pytorch 1.7, cuda 10.2.89

running build
running build_ext
building '_ext' extension
Emitting ninja build file /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.7.2
g++ -pthread -shared -B /home/tantianlong/anaconda3/envs/pytorh_py38/compiler_compat -L/home/tantianlong/anaconda3/envs/pytorh_py38/lib -Wl,-rpath=/home/tantianlong/anaconda3/envs/pytorh_py38/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/vision.o /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cpu/dcn_v2_psroi_pooling_cpu.o /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cpu/dcn_v2_im2col_cpu.o /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cpu/dcn_v2_cpu.o /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cuda/dcn_v2_im2col_cuda.o /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cuda/dcn_v2_psroi_pooling_cuda.o /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cuda/dcn_v2_cuda.o -L/home/tantianlong/anaconda3/envs/pytorh_py38/lib/python3.8/site-packages/torch/lib -L/usr/local/cuda-10.2/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-3.8/_ext.cpython-38-x86_64-linux-gnu.so
g++: error: /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/vision.o: No such file or directory
g++: error: /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cpu/dcn_v2_psroi_pooling_cpu.o: No such file or directory
g++: error: /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cpu/dcn_v2_im2col_cpu.o: No such file or directory
g++: error: /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cpu/dcn_v2_cpu.o: No such file or directory
g++: error: /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cuda/dcn_v2_im2col_cuda.o: No such file or directory
g++: error: /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cuda/dcn_v2_psroi_pooling_cuda.o: No such file or directory
g++: error: /home/tantianlong/DCNv2_latest-master/build/temp.linux-x86_64-3.8/home/tantianlong/DCNv2_latest-master/src/cuda/dcn_v2_cuda.o: No such file or directory
error: command 'g++' failed with exit status 1

Xpangz avatar Mar 09 '21 14:03 Xpangz

@Xpangz this is a different error. In your case the g++ linker is failing to find the compiled objects that it expects the first build stage to have created.

Double-check all your versions: if you used the pip install above to get the pytorch binary compiled with cuda 11, it will not be compatible with your installed cuda 10.2.

Your cuda version, cuda toolkit binary and pytorch (the cuda/cudnn it was compiled with) all have to agree for this to build.
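
For anyone wanting to check this quickly, here is a rough sketch (not part of this repo) that prints the CUDA version your pytorch binary was built with next to the release reported by the local nvcc. The helper names `parse_nvcc_release`, `versions_agree` and `report` are made up for illustration, and it assumes nvcc lives under `CUDA_HOME/bin` or is on your PATH.

```python
import os
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' release from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def versions_agree(torch_cuda, nvcc_cuda):
    """True only when both versions are known and share the same major.minor."""
    if not torch_cuda or not nvcc_cuda:
        return False
    return torch_cuda.split(".")[:2] == nvcc_cuda.split(".")[:2]


def report():
    """Print the CUDA version pytorch was built with next to the local nvcc release."""
    try:
        import torch
        from torch.utils.cpp_extension import CUDA_HOME
    except ImportError:
        print("pytorch is not installed in this environment")
        return

    torch_cuda = torch.version.cuda  # None for CPU-only builds

    # Prefer the nvcc under CUDA_HOME (assumed layout), else fall back to PATH.
    nvcc = os.path.join(CUDA_HOME, "bin", "nvcc") if CUDA_HOME else None
    if not nvcc or not os.path.exists(nvcc):
        nvcc = shutil.which("nvcc")

    nvcc_cuda = None
    if nvcc:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        nvcc_cuda = parse_nvcc_release(result.stdout)

    print("torch:", torch.__version__, "| built with CUDA:", torch_cuda)
    print("CUDA_HOME:", CUDA_HOME, "| nvcc release:", nvcc_cuda)
    print("versions agree:", versions_agree(torch_cuda, nvcc_cuda))


if __name__ == "__main__":
    report()
```

If the last line prints False, the pytorch binary and the local toolkit probably disagree, which would match the kind of conflict described above. It only compares major.minor, so treat it as a first check rather than a guarantee that the build will succeed.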

XDynames avatar Mar 09 '21 21:03 XDynames