Apple Silicon (MLX/CoreML) backend
WhisperLive looks like a great building block for a local voice assistant!
A lot of people interested in local AI applications (including myself) run Apple hardware because of its great price/performance in this area. Ironically, faster_whisper is more like slower_whisper on M-series processors (M1/M2/M3/M4), because CTranslate2 isn't optimized for Apple Silicon and falls back to the CPU.
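For reference, this is roughly what running faster_whisper on a Mac looks like today (a minimal sketch; the model size, compute type, and file name are just example values):

```python
from faster_whisper import WhisperModel

# CTranslate2 has no Metal/ANE backend, so on Apple Silicon the only
# working device is "cpu"; the GPU and Neural Engine sit idle.
model = WhisperModel("small", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav")
print(" ".join(segment.text for segment in segments))
```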
The original ("slow") whisper.cpp has an MLX backend now, which performs better than faster_whisper on Apple processors. It would be great for it to be available to achieve more practical performance on Apple hardware.
(that said: the tiny model does achieve real time performance on M1!)
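For comparison, transcription through the MLX port is essentially a one-liner (a sketch based on the mlx-whisper package; the file name and model repo are example values):

```python
import mlx_whisper

# Runs the Whisper encoder/decoder on the Apple GPU via MLX.
result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="mlx-community/whisper-tiny",
)
print(result["text"])
```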
There are some other interesting options as well (though I'm not sure whether they fit your concept of a "backend"): WhisperX, which has an MLX version in the works, supports batched inference. From my understanding of the process (simultaneous processing of overlapping audio snippets), this could lead to further performance gains; see the sketch below.
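Roughly, the batched WhisperX pipeline looks like this (a sketch; on Apple hardware it currently still runs the CTranslate2 backend on CPU, and batch_size is just an example value):

```python
import whisperx

# Loads faster-whisper under the hood; "cpu" + int8 is what Apple
# Silicon is limited to until an MLX backend lands.
model = whisperx.load_model("small", device="cpu", compute_type="int8")
audio = whisperx.load_audio("audio.wav")

# Batched inference: multiple audio chunks go through the model in one
# pass, which is where the throughput gains come from.
result = model.transcribe(audio, batch_size=8)
print(result["segments"])
```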
Please implement this! This would be awesome.