Does not work on CPU
When installed with CUDA support, the module does not work on CPU.
After installing DCNv2 on a device with GPU, I am able to successfully run both testcpu.py and testcuda.py.
However, when I run testcpu.py with the GPU hidden (CUDA_VISIBLE_DEVICES='' python testcpu.py), the test aborts with a CUDA runtime error as soon as _ext is imported:
THCudaCheck FAIL file=../aten/src/THC/THCGeneral.cpp line=50 error=100 : no CUDA-capable device is detected
terminate called after throwing an instance of 'std::runtime_error'
what(): cuda runtime error (100) : no CUDA-capable device is detected at ../aten/src/THC/THCGeneral.cpp:50
Aborted (core dumped)
Any advice on how to solve this?
After digging a bit deeper, I found that this issue can be avoided by replacing line 11 of src/cuda/dcn_v2_cuda.cu
THCState *state = at::globalContext().lazyInitCUDA();
by
extern THCState *state;
Everything seems to work after this change, but I wonder whether it has any hidden side effects.
Another solution, inspired by https://github.com/pytorch/pytorch/pull/11893/files, is to remove the same line and delay the lazyInitCUDA call until the start of dcn_v2_cuda_forward and dcn_v2_cuda_backward, where the state is actually used.
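To illustrate why deferring the call should help, here is a minimal, self-contained sketch that does not use PyTorch at all. Every name in it (FakeState, fake_lazy_init_cuda, fake_cuda_forward, fake_cpu_forward, g_device_visible, g_init_calls) is a hypothetical stand-in, not part of DCNv2 or ATen. It only models the initialization order: a file-scope initializer runs when the extension is loaded, while a call placed inside a function runs only when that function is invoked.

```cpp
#include <stdexcept>

// Hypothetical stand-in for THCState; not the real type.
struct FakeState { int id; };

// Simulates CUDA_VISIBLE_DEVICES='' (no device visible) when false.
static bool g_device_visible = false;
// Counts how often the fake initializer has been called.
static int g_init_calls = 0;

// Stand-in for lazyInitCUDA: throws when no device is visible.
FakeState* fake_lazy_init_cuda() {
    ++g_init_calls;
    if (!g_device_visible) {
        throw std::runtime_error("no CUDA-capable device is detected");
    }
    static FakeState state{0};
    return &state;
}

// Problematic pattern (left commented out on purpose): a file-scope
// initializer like the one below would run at load time, and the
// uncaught exception would terminate the process before any CPU code runs.
// static FakeState* state = fake_lazy_init_cuda();

// Deferred pattern: the state is fetched only inside the GPU entry point.
int fake_cuda_forward(int x) {
    FakeState* state = fake_lazy_init_cuda();
    (void)state;  // a real kernel launch would use the state here
    return 2 * x;
}

// The CPU path never touches the fake CUDA state.
int fake_cpu_forward(int x) {
    return 2 * x;
}
```

With this layout, calling fake_cpu_forward never triggers the initializer, and a missing device only surfaces as an exception when fake_cuda_forward is actually called. Whether the real dcn_v2_cuda_forward and dcn_v2_cuda_backward behave the same way after the change is an assumption that still needs to be checked with testcpu.py and testcuda.py.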