DCNv2_latest
Zero offset test failed.
Hi. The whole installation process went fine, and all checks in testcpu.py pass. But when I run testcuda.py, the zero offset check always fails. Can this be ignored? Thank you.
I get the same error when I run python testcuda.py.
Note: this is a known issue and may not be a serious problem.
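For context, the zero offset check verifies that a deformable convolution with all offsets set to zero reduces to an ordinary convolution. Below is a minimal, analogous sketch using torchvision.ops.deform_conv2d; it mirrors the idea of the check rather than reproducing the exact test in testcuda.py (shapes and parameters are assumptions for illustration).

```python
# Illustration of what a zero offset check verifies: with all offsets zero,
# a deformable convolution should match an ordinary convolution. This uses
# torchvision.ops.deform_conv2d, not the repo's DCN module.
import torch
import torch.nn.functional as F
from torchvision.ops import deform_conv2d

device = 'cuda' if torch.cuda.is_available() else 'cpu'
N, C_in, C_out, H, W, K = 2, 4, 6, 16, 16, 3

x = torch.randn(N, C_in, H, W, device=device)
weight = torch.randn(C_out, C_in, K, K, device=device)
bias = torch.randn(C_out, device=device)

# zero offsets -> the sampling grid is exactly the regular convolution grid
offset = torch.zeros(N, 2 * K * K, H, W, device=device)

out_deform = deform_conv2d(x, offset, weight, bias, stride=1, padding=1)
out_plain = F.conv2d(x, weight, bias, stride=1, padding=1)

# the two outputs should agree up to floating-point error
print('max abs diff:', (out_deform - out_plain).abs().max().item())
```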
environment
# conda list |grep pytorch
pytorch 1.7.1 py3.8_cuda11.0.221_cudnn8.0.5_0 pytorch
pytorch-lightning 1.4.9 pyhd8ed1ab_0 conda-forge
torchvision 0.8.2 py38_cu110 pytorch
# conda list |grep cu
cudatoolkit 11.0.221 h6bb024c_0
icu 58.2 he6710b0_3
ncurses 6.2 he6710b0_1
pytorch 1.7.1 py3.8_cuda11.0.221_cudnn8.0.5_0 pytorch
torchvision 0.8.2 py38_cu110 pytorch
error output
torch.Size([2, 64, 128, 128])
torch.Size([20, 32, 7, 7])
torch.Size([20, 32, 7, 7])
torch.Size([20, 32, 7, 7])
0.971507, 1.943014
0.971507, 1.943014
Zero offset passed
/opt/conda/lib/python3.8/site-packages/torch/autograd/gradcheck.py:301: UserWarning: The {}th input requires gradient and is not a double precision floating point or complex. This check will likely fail if all the inputs are not of double precision floating point or complex.
warnings.warn(
check_gradient_dpooling: True
Traceback (most recent call last):
File "testcuda.py", line 265, in <module>
check_gradient_dconv()
File "testcuda.py", line 95, in check_gradient_dconv
gradcheck(dcn_v2_conv, (input, offset, mask, weight, bias,
File "/opt/conda/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 401, in gradcheck return not_reentrant_error()
File "/opt/conda/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 398, in not_reentrant_error
return fail_test(error_msg)
File "/opt/conda/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 289, in fail_test
raise RuntimeError(msg)
RuntimeError: Backward is not reentrant, i.e., running backward with same
input and grad_output multiple times gives different values, although analytical gradient matches numerical gradient. The tolerance for nondeterminism was 0.0.
See https://github.com/CharlesShang/DCNv2/#known-issues and https://github.com/CharlesShang/DCNv2/issues/8.
Update: all gradient checks pass with double precision. The remaining issue is that gradcheck raises RuntimeError: Backward is not reentrant. However, the discrepancy is very small (<1e-7 for float, <1e-15 for double), so it may not be a serious problem.
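A minimal sketch of running the gradient check in double precision while tolerating the tiny nondeterminism behind the reentrancy error. The tensor shapes, the trailing convolution arguments (stride, padding, dilation, deformable_groups), and the tolerance values are illustrative assumptions, not copied from testcuda.py; dcn_v2_conv is the function named in the traceback above.

```python
# Hedged sketch: gradcheck in double precision with a small nondeterminism
# tolerance. Shapes and trailing arguments are assumptions for illustration.
import torch
from torch.autograd import gradcheck
from dcn_v2 import dcn_v2_conv  # function named in the traceback above

N, c_in, c_out, H, W = 2, 4, 4, 8, 8
kh = kw = 3
dg = 1  # deformable groups (assumed)

input  = torch.randn(N, c_in, H, W, dtype=torch.double, device='cuda', requires_grad=True)
offset = torch.randn(N, 2 * dg * kh * kw, H, W, dtype=torch.double, device='cuda', requires_grad=True)
mask   = torch.rand(N, dg * kh * kw, H, W, dtype=torch.double, device='cuda', requires_grad=True)
weight = torch.randn(c_out, c_in, kh, kw, dtype=torch.double, device='cuda', requires_grad=True)
bias   = torch.randn(c_out, dtype=torch.double, device='cuda', requires_grad=True)

# nondet_tol (available in recent PyTorch) lets gradcheck accept the tiny
# run-to-run differences that otherwise raise "Backward is not reentrant".
ok = gradcheck(dcn_v2_conv,
               (input, offset, mask, weight, bias, 1, 1, 1, dg),
               eps=1e-6, atol=1e-4, nondet_tol=1e-12)
print('gradcheck passed:', ok)
```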
How was this problem solved?