CTranslate2
GeForce RTX 50XX - cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED
Seen with faster-whisper. When compute_type="auto" is used (it auto-selects int8_float16), this error is raised:
With cuBLAS v12.1.3.1:
File "faster_whisper\transcribe.py", line 2179, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1471, in generate_segments
File "faster_whisper\transcribe.py", line 1719, in encode
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED
With cuBLAS v12.8.4.1:
File "faster_whisper\transcribe.py", line 2179, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1568, in generate_segments
File "faster_whisper\transcribe.py", line 1918, in add_word_timestamps
File "faster_whisper\transcribe.py", line 2037, in find_alignment
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED
All these types are reported as supported by CTranslate2:
int8 = FAIL
int8_float16 = FAIL
int8_float32 = FAIL
int8_bfloat16 = FAIL
float16 = WORKS
float32 = WORKS
bfloat16 = WORKS
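For anyone who wants to reproduce the table above or work around the failure, here is a minimal sketch that tries compute types in order and returns the first one that does not raise. The helper names (first_working_compute_type, make_whisper_probe) and the fallback order are my own assumptions, not part of faster-whisper or CTranslate2. The probe consumes the segment generator with word_timestamps=True because both tracebacks are raised while segments are being generated, not when the model is loaded.

```python
from typing import Callable, Iterable, Optional

# Assumed fallback order: the types reported as working in this issue.
FALLBACK_TYPES = ("float16", "bfloat16", "float32")


def first_working_compute_type(
    candidates: Iterable[str],
    probe: Callable[[str], None],
) -> Optional[str]:
    """Return the first compute type for which probe() does not raise RuntimeError."""
    for compute_type in candidates:
        try:
            probe(compute_type)
        except RuntimeError as exc:
            # cuBLAS failures surface as RuntimeError in the tracebacks above.
            print(f"{compute_type} = FAIL ({exc})")
            continue
        print(f"{compute_type} = WORKS")
        return compute_type
    return None


def make_whisper_probe(model_name: str, audio_path: str) -> Callable[[str], None]:
    """Build a probe that runs a real transcription.

    Hypothetical helper: needs faster-whisper installed and a CUDA GPU.
    """

    def probe(compute_type: str) -> None:
        # Imported lazily so the selection logic can be used without the library.
        from faster_whisper import WhisperModel

        model = WhisperModel(model_name, device="cuda", compute_type=compute_type)
        segments, _info = model.transcribe(audio_path, word_timestamps=True)
        # segments is a generator; the encoder and alignment only run while iterating.
        for _ in segments:
            pass

    return probe
```

Possible usage, with a placeholder model name and audio file: first_working_compute_type(FALLBACK_TYPES, make_whisper_probe("small", "sample.wav")). Passing the chosen type explicitly as compute_type avoids letting "auto" pick int8_float16.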
Does anyone know what is going on with these GeForce RTX 50XX GPUs?
Tags: 5070, 5070Ti, 5080, 5090