unsupervised-deep-homography
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
What is this?
I turned down batch_size, but the error still occurs.
Hi @fanzhen12, can you please send the full stack trace, along with your PyTorch and CUDA versions?
Thanks for your reply, and sorry for the late response. Let me elaborate on the problem I have encountered.
- I downloaded the repo and put the COCO dataset in it; the file structure looks like this:
- As you can see, test2014, train2014 and val2014 are the three parts of the COCO dataset. I didn't change any part of the code. I then ran "python train.py ./train2014/ ./val2014/" in the terminal, like this:
and then the error appears: "RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)`"
- I suspected that the graphics card memory was not enough, so I reduced the batch_size; the network model is shown in the figure:
At runtime, the output is as follows:
So I guess that even if the memory really is insufficient, it should not be because the network model is too large. The GPU situation is as follows:
These graphics cards should have enough memory. These are my thoughts on the error, but maybe I am going in the wrong direction and the cause is not insufficient memory at all but something else. I look forward to your early reply, thank you very much. By the way, my PyTorch and CUDA version is torch 1.8.0+cu111. @teddykoker
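For reference, the versions and per-GPU memory above can be confirmed with a short snippet like this (only standard torch calls, nothing specific to this repo):

```python
import torch

# Installed PyTorch build and the CUDA version it was compiled against
print("torch:", torch.__version__)             # e.g. 1.8.0+cu111
print("built with CUDA:", torch.version.cuda)  # e.g. 11.1
print("CUDA available:", torch.cuda.is_available())

# Name and total memory of each visible GPU, to rule out an out-of-memory cause
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GiB")
```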
I suspect the CUBLAS_STATUS_EXECUTION_FAILED error is not an out-of-memory issue, but probably some sort of driver issue. Maybe try installing torch+cu113:

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

which would be closer to the version of CUDA you have on your machine.
Have you been able to run any other PyTorch code on that machine? This issue is likely not specific to this code base.
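For example, a tiny check like the one below (plain PyTorch, independent of this repo) exercises the same cuBLAS matmul path that is failing in train.py; if it also raises CUBLAS_STATUS_EXECUTION_FAILED, the problem is with the driver/CUDA setup rather than with this code:

```python
import torch

# Small batched matrix multiply on the GPU; float32 matmuls like this are
# dispatched to cuBLAS (cublasSgemm / cublasSgemmStridedBatched), the same
# routine reported in the error above.
a = torch.randn(8, 64, 64, device="cuda")
b = torch.randn(8, 64, 64, device="cuda")
c = torch.bmm(a, b)
torch.cuda.synchronize()  # make sure the kernel actually executed
print("batched matmul OK:", tuple(c.shape))
```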
Closing due to inactivity