mmcv
Training with GPU -- RuntimeError: roi_align_forward_impl: implementation for device cuda:0 not found
Prerequisite
- [X] I have searched Issues and Discussions but cannot get the expected help.
- [X] The bug has not been fixed in the latest version (https://github.com/open-mmlab/mmcv).
Environment
I'm training on the AVA dataset for spatio-temporal activity detection, but training does not use any GPU even though my machine has 2 GPUs.
GPUs are supposed to be used by default, but that is not happening with the latest mmcv version. If I enable the GPUs with the CUDA_VISIBLE_DEVICES=0,1
environment variable, I get this error:
File "/home/soumyadeep/mmaction_custom/mmaction2_v1.0/mmcv/mmcv/ops/roi_align.py", line 90, in forward
ext_module.roi_align_forward(
RuntimeError: roi_align_forward_impl: implementation for device cuda:0 not found.
Reproduces the problem - code sample
Reproduces the problem - command or script
CUDA_VISIBLE_DEVICES=0,1 bash tools/dist_train.sh /home/soumyadeep/mmaction_custom/mmaction2_v1.0/configs/detection/slowfast/slowfast_kinetics400-pretrained-r50_8xb16-4x16x1-20e_ava21-rgb.py 2
Reproduces the problem - error message
File "/home/soumyadeep/miniconda3/envs/openmmlab2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/soumyadeep/mmaction_custom/mmaction2_v1.0/mmaction/models/roi_heads/roi_extractors/single_straight3d.py", line 122, in forward
roi_feat = self.roi_layer(frame_feat, rois)
File "/home/soumyadeep/miniconda3/envs/openmmlab2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/soumyadeep/mmaction_custom/mmaction2_v1.0/mmcv/mmcv/ops/roi_align.py", line 210, in forward
return roi_align(input, rois, self.output_size, self.spatial_scale,
File "/home/soumyadeep/miniconda3/envs/openmmlab2/lib/python3.8/site-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/soumyadeep/mmaction_custom/mmaction2_v1.0/mmcv/mmcv/ops/roi_align.py", line 90, in forward
ext_module.roi_align_forward(
RuntimeError: roi_align_forward_impl: implementation for device cuda:0 not found.
Additional information
No response
Hi @soumyadbanik, it may be that mmcv was not installed with CUDA op support.
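
One way to check this is to compare what PyTorch sees with what the installed mmcv ops were compiled against. The script below is a minimal sketch, not an official diagnostic: `collect_report` and `ops_built_with_cuda` are hypothetical helper names, and it assumes that `mmcv.ops.get_compiling_cuda_version()` returns the string "not available" for a build without CUDA ops, which is my understanding of current mmcv behavior. Run it in the same conda environment used for training.

```python
import importlib


def ops_built_with_cuda(compiling_cuda_version):
    """Return True if the reported string names a CUDA toolkit version.

    Assumption: a CPU-only mmcv build reports "not available".
    """
    if not compiling_cuda_version:
        return False
    return compiling_cuda_version.strip().lower() != "not available"


def collect_report():
    """Gather GPU-related facts from torch and mmcv, tolerating missing packages."""
    report = {
        "torch_cuda_available": None,
        "torch_cuda_version": None,
        "visible_gpus": None,
        "mmcv_version": None,
        "mmcv_compiling_cuda": None,
    }
    try:
        torch = importlib.import_module("torch")
        report["torch_cuda_available"] = torch.cuda.is_available()
        report["torch_cuda_version"] = torch.version.cuda
        report["visible_gpus"] = torch.cuda.device_count()
    except ImportError:
        pass  # torch is not installed in this environment
    try:
        mmcv = importlib.import_module("mmcv")
        report["mmcv_version"] = mmcv.__version__
        # Importing mmcv.ops fails if the compiled extension is missing entirely.
        ops = importlib.import_module("mmcv.ops")
        report["mmcv_compiling_cuda"] = ops.get_compiling_cuda_version()
    except ImportError:
        pass  # mmcv or its compiled ops could not be imported
    report["ops_built_with_cuda"] = ops_built_with_cuda(
        report["mmcv_compiling_cuda"]
    )
    return report


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

If `torch_cuda_available` is True but `ops_built_with_cuda` is False, the ops were most likely compiled for CPU only, which would match the "implementation for device cuda:0 not found" error. The traceback paths suggest mmcv is imported from a local source checkout, so rebuilding it in an environment where CUDA is visible at build time, or installing a prebuilt mmcv wheel that matches the installed PyTorch and CUDA versions, would be the usual next step.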