CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm`
I am getting the error RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
when running the following training command:
python -m torch.distributed.launch \
--nproc_per_node=1 \
--use_env \
main.py \
--pretrained params/detr-r50-pre-2stage-q64.pth \
--output_dir logs \
--dataset_file hico \
--hoi_path data/hico_20160224_det \
--num_obj_classes 80 \
--num_verb_classes 117 \
--backbone resnet50 \
--num_queries 64 \
--dec_layers_hopd 3 \
--dec_layers_interaction 3 \
--epochs 90 \
--lr_drop 60 \
--use_nms_filter
I am using Python 3.7 and CUDA 10.1.
Could you provide more details, e.g., where/which line does this error occur in the project?
The error occurred in this line.
I am getting the same issue. I tried different CUDA versions with no luck. @Lopa07, were you able to fix it? Thanks!
I was able to fix this by using CUDA Toolkit 10.2 with cuDNN 8.7 for CUDA 10.2 (https://developer.nvidia.com/rdp/cudnn-archive#a-collapse870-102). Hope this helps, @Lopa07.
The following versions work as well, as long as the matching cuDNN version is installed:
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch
This guide is helpful for cuDNN installation.
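For anyone else hitting this, a quick way to check whether the installed PyTorch, CUDA, and cuDNN builds work together is to print their versions and run one small float32 matrix multiply on the GPU outside the training script. The sketch below is only an illustration: the function name `cuda_smoke_test` and the report keys are made up here and are not part of this repo or of PyTorch. It also assumes that a plain float32 matmul goes through the same cuBLAS path as the failing call, which is typical but not guaranteed on every GPU and PyTorch version.

```python
import importlib


def cuda_smoke_test(size=64):
    """Report PyTorch/CUDA/cuDNN versions and try one float32 matmul on the GPU.

    Returns a dict and does not raise if PyTorch or a GPU is missing.
    (Hypothetical helper for illustration only.)
    """
    report = {"torch": None, "cuda": None, "cudnn": None, "gpu": None, "sgemm_ok": None}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return report  # PyTorch is not installed in this environment

    report["torch"] = torch.__version__
    report["cuda"] = torch.version.cuda  # CUDA version PyTorch was built with (None on CPU builds)
    report["cudnn"] = torch.backends.cudnn.version()  # None if cuDNN is unavailable

    if not torch.cuda.is_available():
        return report  # no usable GPU, so the matmul check is skipped

    report["gpu"] = torch.cuda.get_device_name(0)
    try:
        a = torch.randn(size, size, device="cuda")
        b = torch.randn(size, size, device="cuda")
        c = a @ b  # float32 matrix multiply on the GPU
        torch.cuda.synchronize()  # CUDA errors are asynchronous; force them to surface here
        report["sgemm_ok"] = bool(torch.isfinite(c).all())
    except RuntimeError as err:
        report["sgemm_ok"] = False
        report["error"] = str(err)
    return report


if __name__ == "__main__":
    for key, value in cuda_smoke_test().items():
        print(f"{key}: {value}")
```

If `sgemm_ok` comes back `False` here, the problem is likely in the CUDA/cuDNN/driver setup rather than in the training code. If it comes back `True`, the mismatch may be elsewhere, for example in tensor shapes or class counts passed to the model.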