faster-whisper
It does not work normally on the RTX 5070 Ti.
The RTX 5070 Ti throws RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED. Adding the --compute_type float32 option let it run, but long videos still did not work properly: on videos longer than 2 hours, the program only transcribed the first 33 minutes and then stopped abruptly.
Check what is your cuBLAS version or what CUDA Toolkit version you have installed?
I am not a specialist in this area of technology, and I have not installed cuBLAS or the CUDA Toolkit myself. I use PotPlayer to generate subtitles with this model on my laptop with an RTX 3060, and that works fine. However, when I try the same process on my RTX 5070 Ti, it runs into issues.
RTX 5070
Don't know why compute_type auto [int8] doesn't work with these GPUs; use --compute_type float16
I use PotPlayer to generate subtitles using this model
Then you are in the wrong repo, go there: https://github.com/Purfview/whisper-standalone-win
I also can't use it with my 5070 Ti. Basically, all 50-series cards are unusable.
Use proper settings.
Thanks for this, just helped me as well, can confirm did not work on auto or int8 but did work on float16.
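For anyone scripting this, here is a rough sketch of that fallback logic. The preference order and the `load_with_fallback` helper are my own assumptions based on this thread, not part of faster-whisper; also note the cuBLAS error may only surface at transcribe time, so you might wrap a short warm-up transcription instead of just the model load.

```python
from typing import Callable, Optional, Sequence, Tuple

# Order based on reports in this thread: int8 variants fail on 50xx cards
# with CUBLAS_STATUS_NOT_SUPPORTED, while float16 works.
PREFERRED_COMPUTE_TYPES: Tuple[str, ...] = ("int8_float16", "float16", "float32")

def load_with_fallback(
    load: Callable[[str], object],
    compute_types: Sequence[str] = PREFERRED_COMPUTE_TYPES,
) -> Tuple[str, object]:
    """Try compute types in order; cuBLAS failures surface as RuntimeError."""
    last_err: Optional[Exception] = None
    for ct in compute_types:
        try:
            return ct, load(ct)
        except RuntimeError as err:
            last_err = err
    raise last_err if last_err else ValueError("no compute types given")
```

With faster-whisper this could be called as, e.g., `load_with_fallback(lambda ct: WhisperModel("large-v3", device="cuda", compute_type=ct))`.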
@teddybear082 @ictsmc, do you have this issue when using Python and this repo?
I’m using the faster-whisper Python library via WingmanAI by ShipBit: https://github.com/ShipBit/wingman-ai. I believe they use PyInstaller to turn the Python code into an exe, and faster-whisper is one of the dependencies.
So not Python directly. Kinda strange that I have lots of reports about this in my repo, but I don't see any reports in the Python repos.
Btw, there are similar reports about pyannote and 50xx GPUs, but none in the pyannote repo either.
What do you mean, not Python directly? Isn't this the repo for the faster-whisper PyPI project? Wingman depends on the faster-whisper==1.1.1 Python library, I believe. I may just be misunderstanding what you mean.
I meant using Python directly, not an exe compiled with PyInstaller. And it's strange that all the reports about 50xx cards come only from the "exe" repos.
Maybe because the original default for compute_type is not "auto"; I don't remember now, it could be "default".
Here is link to the issue at CTranslate2: https://github.com/OpenNMT/CTranslate2/issues/1865
Same issue on my 5070 Ti. Is there any way to force it to use the CPU rather than the GPU?
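You can pass device="cpu" when constructing WhisperModel. A small selection helper might look like the sketch below; the FW_FORCE_CPU environment variable name is my own invention, not anything faster-whisper reads.

```python
import os

def select_device(force_cpu: bool = False) -> str:
    """Return 'cpu' when forced (via flag or FW_FORCE_CPU=1 env var), else 'cuda'."""
    if force_cpu or os.environ.get("FW_FORCE_CPU") == "1":
        return "cpu"
    return "cuda"
```

Then, e.g., `WhisperModel("small", device=select_device(), compute_type="int8")`. Keep in mind that int8 on CPU is typically far slower than float16 on a working GPU.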
Hi everyone,
I’m running into the same issue on an RTX 5070 Ti, and oddly I see better performance on an RTX 2070 SUPER. Here’s what I’ve observed:
RTX 5070 Ti
With compute_type='int8_float16', I get:
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED
Switching to compute_type='float16' works, but my transcription speed is only around 21× real‑time.
RTX 2070 SUPER
I can use compute_type='int8_float16' without errors and achieve about 86× real‑time speed.
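For comparing runs, the speedup figures above are just audio duration divided by wall-clock transcription time; a trivial helper (my own, not a faster-whisper API) makes the comparison explicit:

```python
def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """How many seconds of audio are transcribed per second of wall time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds
```

For example, one hour of audio transcribed in about 42 seconds of wall time gives `realtime_factor(3600, 42)` of roughly 86×, matching the 2070 SUPER figure above.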
Was any solution found here?
Yes.