
ValueError: unsupported device mps | Trying to run my faster-whisper application using "mps" on my M2 Mac

Open jack-tol opened this issue 1 year ago • 7 comments

So faster-whisper is built using CTranslate2 and checking the CTranslate2 github, they say:

"Multiple CPU architectures support The project supports x86-64 and AArch64/ARM64 processors and integrates multiple backends that are optimized for these platforms: Intel MKL, oneDNN, OpenBLAS, Ruy, and Apple Accelerate."

Source: https://github.com/OpenNMT/CTranslate2

This led me to believe I would be able to use my MPS device with faster-whisper, but unfortunately, when I tried, it gave me this error:

Traceback (most recent call last):
  File "/Users/jack/Desktop/landoff_faster/app.py", line 14, in <module>
    model = WhisperModel(model_size, device="mps", compute_type="float32")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/faster_whisper/transcribe.py", line 145, in __init__
    self.model = ctranslate2.models.Whisper(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: unsupported device mps

Setting the device to "cpu" does work, but I don't think that uses the GPU cores for faster, more efficient inference.
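
For context, CTranslate2 (the engine behind faster-whisper) only accepts "cpu", "cuda", or "auto" as device strings; there is no "mps" backend, which is exactly why the constructor raises. A minimal sketch of a fallback helper, assuming you want to request "mps" opportunistically and degrade gracefully (the `pick_device` name is hypothetical, not part of either library's API):

```python
# CTranslate2 currently accepts only these device strings; "mps" is not one
# of them, so requesting it raises ValueError at model construction time.
SUPPORTED_DEVICES = {"cpu", "cuda", "auto"}

def pick_device(requested: str) -> str:
    """Fall back to "cpu" when the requested device isn't supported
    (e.g. "mps" on Apple Silicon)."""
    return requested if requested in SUPPORTED_DEVICES else "cpu"

device = pick_device("mps")  # resolves to "cpu" on a Mac
# model = WhisperModel("large-v3", device=device, compute_type="int8")
```

Note that on Apple Silicon the CPU path still goes through the Accelerate/Ruy backends mentioned in the CTranslate2 README, so it is optimized for the ARM cores; it just never touches the GPU.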

I would just like to know whether there is any native support for running faster-whisper on Macs, and also what I can expect when running faster-whisper on my iPhone 14 Pro Max when I deploy my application. Will it efficiently use the A16 Bionic chip via Apple Accelerate?

Maybe I have it wrong, and Apple Accelerate only accelerates the CPU and doesn't touch the GPU on the ARM-based M-chips and A-chips. If that's the case, does anyone know how I can get inference working with the Apple Neural Engine or CoreML?

Any help is appreciated.

Cheers, Jack

jack-tol avatar Jul 16 '24 10:07 jack-tol


I'd be able to work on getting support, but I don't own a Mac, so if you're willing to commit hours of benchmarking and testing to make sure my code is correct, I would devote the time. I'm not trying to exaggerate; I'm just emphasizing that the change would require multiple hours of benchmarking and testing.

BBC-Esq avatar Sep 14 '24 10:09 BBC-Esq

I'm willing to commit a couple of hours of benchmarking

As long as I'm given a checklist lol

I have an M1 Pro, 16 GB RAM (2021)

DrewThomasson avatar Oct 18 '24 03:10 DrewThomasson

Hi, were you able to run faster-whisper on a Mac?

ushnah avatar Nov 13 '24 20:11 ushnah

I'm willing to commit a couple of hours of benchmarking

As long as I'm given a checklist lol

I have an M1 Pro, 16 GB RAM (2021)

I can commit some time to benchmarking if this hasn't been completed yet.

ngopee avatar Dec 09 '24 16:12 ngopee

I tried to run a project with faster-whisper as a dependency on an M3 MacBook. It failed with device "mps", while it worked (slowly) with device "cpu".

lamerentertainment avatar Dec 10 '24 10:12 lamerentertainment

Any updates on this? It's still not working with an M3 chip.

ChrisRahme-Innovum avatar Apr 28 '25 08:04 ChrisRahme-Innovum

This is why I use whisper.cpp

MUCZ avatar May 20 '25 10:05 MUCZ
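
For readers landing here: whisper.cpp does ship an Apple-side acceleration path (a CoreML-converted encoder on Apple Silicon), which is why it keeps coming up as the workaround. A hedged sketch of driving its CLI from Python; the binary name (`whisper-cli` in recent builds, formerly `main`) and the model/audio paths are assumptions you will need to adjust to your own build:

```python
import shutil
import subprocess

def whispercpp_cmd(model_path: str, audio_path: str,
                   binary: str = "whisper-cli") -> list:
    """Build the argv for a whisper.cpp transcription run.

    -m points at a ggml model file and -f at the input audio; both paths
    and the binary name here are placeholders, not guaranteed defaults."""
    return [binary, "-m", model_path, "-f", audio_path]

cmd = whispercpp_cmd("models/ggml-base.en.bin", "audio.wav")
if shutil.which(cmd[0]):  # only run when the binary is actually on PATH
    subprocess.run(cmd, check=True)
else:
    print("whisper.cpp binary not found; build it first")
```

This sidesteps the CTranslate2 limitation entirely rather than fixing it, so it is a workaround for Mac users, not an answer to the original MPS request.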