GeForce RTX 50XX - cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED

Open Purfview opened this issue 9 months ago • 0 comments

In faster-whisper, when compute_type="auto" is used (it auto-selects int8_float16), transcription fails with this error:

With cuBLAS v12.1.3.1:

File "faster_whisper\transcribe.py", line 2179, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1471, in generate_segments
File "faster_whisper\transcribe.py", line 1719, in encode
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED

With cuBLAS v12.8.4.1:

 File "faster_whisper\transcribe.py", line 2179, in restore_speech_timestamps
 File "faster_whisper\transcribe.py", line 1568, in generate_segments
 File "faster_whisper\transcribe.py", line 1918, in add_word_timestamps
 File "faster_whisper\transcribe.py", line 2037, in find_alignment
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED

All these types are reported as supported by CTranslate2:

int8 = FAIL
int8_float16 = FAIL
int8_float32 = FAIL
int8_bfloat16 = FAIL
float16 = WORKS
float32 = WORKS
bfloat16 = WORKS
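
Until the cause is known, one possible workaround is to skip "auto" and pick a non-int8 compute type explicitly. The sketch below is only an illustration of that idea: `pick_compute_type`, `FAILING_ON_RTX_50XX` and `PREFERENCE` are hypothetical names, not part of faster-whisper or CTranslate2, the failing set is copied from the results above, and the preference order is an assumption.

```python
# Compute types that fail on GeForce RTX 50XX according to the results above.
FAILING_ON_RTX_50XX = frozenset(
    {"int8", "int8_float16", "int8_float32", "int8_bfloat16"}
)

# Assumed preference order among the types that work; adjust as needed.
PREFERENCE = ("float16", "bfloat16", "float32")


def pick_compute_type(supported, failing=FAILING_ON_RTX_50XX, preference=PREFERENCE):
    """Return the first preferred type that is supported and not known to fail."""
    usable = set(supported) - set(failing)
    for candidate in preference:
        if candidate in usable:
            return candidate
    raise ValueError("no usable compute type among: %s" % sorted(supported))


if __name__ == "__main__":
    # Example input only; on a real machine the set would come from CTranslate2.
    reported = {"int8", "int8_float16", "float16", "float32"}
    print(pick_compute_type(reported))
```

With CTranslate2 installed, the `supported` set could come from `ctranslate2.get_supported_compute_types("cuda")`, and the result could be passed as `compute_type=` to `WhisperModel(...)` in place of "auto". This only sidesteps the int8 paths; it does not explain why cuBLAS rejects them.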

Does anyone know what is going on with these GeForce RTX 50XX GPUs?

Tags: 5070, 5070Ti, 5080, 5090

Mar 10 '25 14:03 Purfview