ReDet
CUDA error: an illegal memory access was encountered in roi_align backward function
Thanks for your work, it's really helpful.
conda environment:
mmcv 0.2.16
cuda 11.1
torch 1.8.0
RTX 3080
Dataset:
FAIR1M, for oriented bounding box (OBB) detection
When I use roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2) in config.py, I can only train on two GPUs. With four GPUs it prints an error: " THCudaCheck FAIL file=ReDet/mmdet/ops/roi_align/src/roi_ane=292 error=700 : an illegal memory access was encountered ". However, roi_layer=dict(type='RoIPool', out_size=7) works fine on all 4 GPUs, which is weird.
Therefore I am sure there is a bug left in roi_align_kernel.cu, and I am debugging it now.
Any ideas? Thanks.
What's more, there is a problem with the validation mAP evaluation: it is always zero, isn't it? Do you have the same problem? If so, I have fixed it by changing some files in mmdet/core/evaluation/.
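A note on narrowing this down: CUDA kernels launch asynchronously, so the file and line reported with error 700 may not be where the fault actually happened. A minimal sketch, assuming the training entry point is a Python script that can be edited, is to force synchronous launches before torch touches CUDA:

```python
import os

# Force synchronous kernel launches so an illegal memory access is reported
# at the kernel that caused it. This must be set before CUDA is initialized,
# so place it at the very top of the training script, before `import torch`.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```

This slows training noticeably, so it is only meant for the debugging run.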
Line 292:
https://github.com/csuhan/ReDet/blob/0b9addf3c2734659fd6ffc7824f2e659fde4419c/mmdet/ops/riroi_align/src/riroi_align_kernel.cu#L292
Please check the annotation first and make sure all bboxes have valid values (especially the angle field).
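A check like this could be sketched as follows. This is only an illustration: the function name `find_invalid_boxes`, the `(cx, cy, w, h, angle)` layout, and the angle limit in radians are assumptions, not the format ReDet actually stores, so adapt them to the real annotation fields.

```python
import math


def find_invalid_boxes(boxes, max_abs_angle=math.pi):
    """Return (index, reason) pairs for boxes that look malformed.

    Assumption: each box is (cx, cy, w, h, angle), angle in radians.
    """
    problems = []
    for i, box in enumerate(boxes):
        if len(box) != 5:
            problems.append((i, "expected 5 values"))
            continue
        # NaN or inf anywhere in the box can make a kernel index out of bounds.
        if not all(math.isfinite(v) for v in box):
            problems.append((i, "NaN or inf value"))
            continue
        cx, cy, w, h, angle = box
        if w <= 0 or h <= 0:
            problems.append((i, "non-positive width or height"))
        elif abs(angle) > max_abs_angle:
            problems.append((i, "angle out of range"))
    return problems


# Hypothetical sample data: one valid box followed by three bad ones.
sample = [
    (10.0, 10.0, 4.0, 2.0, 0.3),
    (10.0, 10.0, 4.0, 2.0, float("nan")),
    (10.0, 10.0, 0.0, 2.0, 0.1),
    (1.0, 1.0, 1.0, 1.0, 10.0),
]
for index, reason in find_invalid_boxes(sample):
    print(index, reason)
```

Running it over every image's boxes before training would show whether any annotation needs to be fixed or dropped.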
I have not met this bug yet. Can you share your modification?