whispercpp
bug: Runs Exclusively on CPU
Describe the bug
This binding is about 10 times slower than native whisper.cpp because it runs exclusively on the CPU on my M2 device. whisper.cpp on its own runs fine on the GPU, so there is no reason why the same should not be possible for the Python bindings.
To reproduce
I ran this code:
from whispercpp import Whisper
w = Whisper.from_pretrained("large")
transcript = w.transcribe_from_file("output.wav")
I compared it with the whisper.cpp command:
./main -f output.wav -m models/ggml-large.bin -otxt
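To put a number on the "10 times slower" observation, the two runs above could be timed side by side. This is only a rough sketch: the helper names (`timed`, `run_binding`, `run_cli`, `compare`) are made up for illustration, and it assumes the script is started from the whisper.cpp checkout so that `./main` and `models/ggml-large.bin` resolve as in the command above. Both measurements include model loading.

```python
import subprocess
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def run_binding(wav_path, model="large"):
    """Transcribe with the Python bindings, using the same calls as the report."""
    # Imported lazily so the timing helper works without whispercpp installed.
    from whispercpp import Whisper

    w = Whisper.from_pretrained(model)
    return w.transcribe_from_file(wav_path)


def run_cli(wav_path, model_path="models/ggml-large.bin", binary="./main"):
    """Transcribe with the native binary, using the same flags as the report.

    The default paths are assumptions taken from the command line above.
    """
    subprocess.run(
        [binary, "-f", wav_path, "-m", model_path, "-otxt"],
        check=True,
        capture_output=True,
    )


def compare(wav_path):
    """Time both paths on one file and return (binding_seconds, cli_seconds)."""
    _, binding_s = timed(run_binding, wav_path)
    _, cli_s = timed(run_cli, wav_path)
    return binding_s, cli_s
```

Calling `compare("output.wav")` and dividing the first value by the second gives the slowdown factor for that file.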
Expected behavior
The bindings should run on the GPU and be about 10x faster, in line with native whisper.cpp.
Environment
Python 3.11, macOS Sonoma, Apple M2
The strength of whisper.cpp comes from all the back-ends it can use (especially for non-NVIDIA GPU users: OpenVINO, OpenCL). Unfortunately, none of those seem to be supported in these bindings.