Uses MPS (Mac acceleration) by default when available

Open dwarkeshsp opened this issue 2 years ago • 54 comments

Currently, Whisper defaults to using the CPU on macOS devices, even though PyTorch has introduced the Metal Performance Shaders (MPS) backend for Apple devices in its nightly release (more info).

With my changes to __init__.py, torch checks whether MPS is available when no torch.device has been specified. If MPS is available and CUDA is not, Whisper defaults to MPS.

This way, Mac users can experience speedups from their GPU by default.
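
A minimal sketch of the selection logic (not the exact diff; torch.backends.mps.is_available() requires PyTorch >= 1.12):

import torch

def default_device() -> str:
    # Prefer CUDA, then MPS, then CPU, matching the behavior described above
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"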

dwarkeshsp avatar Oct 21 '22 00:10 dwarkeshsp

@dwarkeshsp have you measured any speedups compared to using the CPU?

usergit avatar Oct 21 '22 05:10 usergit

Doesn't this also require switching FP16 off?
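
For reference, FP16 can be switched off with the --fp16 False CLI flag, or from Python (a minimal sketch; the model name and audio file are placeholders):

import whisper

# fp16=False makes Whisper decode in FP32 instead of half precision
model = whisper.load_model("base", device="mps")
result = model.transcribe("audio.mp3", fp16=False)
print(result["text"])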

Michcioperz avatar Oct 21 '22 18:10 Michcioperz

I'm getting this error when trying to use MPS:

/Users/diego/.pyenv/versions/3.10.6/lib/python3.10/site-packages/whisper-1.0-py3.10.egg/whisper/decoding.py:629: UserWarning: The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/diego/Projects/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
  audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
/AppleInternal/Library/BuildRoots/2d9b4df9-4b93-11ed-b0fc-2e32217d8374/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:794: failed assertion `[MPSNDArray, initWithBuffer:descriptor:] Error: buffer is not large enough. Must be 23200 bytes '
Abort trap: 6
/Users/diego/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown

any clues?

DiegoGiovany avatar Nov 09 '22 12:11 DiegoGiovany

@DiegoGiovany Not an expert on this, but it looks like PyTorch itself is missing some operators for MPS. See for example https://github.com/pytorch/pytorch/issues/77764#issuecomment-1254352628 (which refers to repeat_interleave)

and https://github.com/pytorch/pytorch/issues/87219
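
As a stopgap for missing operators, PyTorch can fall back to the CPU per-op if the PYTORCH_ENABLE_MPS_FALLBACK environment variable is set before torch is imported (this works around NotImplementedError-style failures for unsupported ops, not necessarily the buffer assertion above):

import os

# Must be set before torch is imported; unsupported MPS ops then run on the CPU
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch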

glangford avatar Nov 14 '22 19:11 glangford

Thanks for your work. I just tried this. Unfortunately, it didn't work for me on my M1 Max with 32GB. Here is what I did: pip install git+https://github.com/openai/whisper.git@refs/pull/382/head

No errors on install and it works fine when run without mps: whisper audiofile_name --model medium

When I run: whisper audiofile_name --model medium --device mps

Here is the error I get:

Detecting language using up to the first 30 seconds. Use --language to specify the language
loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/810eba08-405a-11ed-86e9-6af958a02716/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x1024x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).

When I run: whisper audiofile_name --model medium --device mps --fp16 False

Here is the error I get:

Detecting language using up to the first 30 seconds. Use --language to specify the language
Detected language: English
/anaconda3/lib/python3.9/site-packages/whisper/decoding.py:633: UserWarning: The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
  audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
/AppleInternal/Library/BuildRoots/f0468ab4-4115-11ed-8edc-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:794: failed assertion `[MPSNDArray, initWithBuffer:descriptor:] Error: buffer is not large enough. Must be 1007280 bytes

Basically, same error as @DiegoGiovany.

Any ideas on how to fix?

gltanaka avatar Nov 17 '22 18:11 gltanaka

+1 for me! I'm actually using an Intel Mac with Radeon Pro 560X 4 GB...

megeek avatar Nov 28 '22 06:11 megeek

Related https://github.com/pytorch/pytorch/issues/87351

glangford avatar Nov 28 '22 13:11 glangford

@dwarkeshsp

It doesn't work for me: MBP 2015, PyTorch 1.3 stable, eGPU RX580, macOS 12.3.

I changed the code the same way you did.

When I use --device mps it shows an error; maybe something else still needs to be changed.

With --device cpu, it works.

With other PyTorch Metal projects, MPS works.
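
A quick way to confirm the MPS backend itself works, independent of Whisper:

import torch

assert torch.backends.mps.is_available()
x = torch.ones(3, device="mps")
print((x * 2).cpu())  # should print tensor([2., 2., 2.])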

PhDLuffy avatar Dec 08 '22 13:12 PhDLuffy

What's the status on this?

changeling avatar Jan 16 '23 19:01 changeling

I also see the same errors as others mentioned above, on an M1 Mac running arm64 Python.

jongwook avatar Jan 18 '23 23:01 jongwook

On an M1 16" MBP with 16GB running macOS 13.0.1, I'm seeing the following with openai-whisper-20230117:

Using this command: (venv) whisper_ai_playground % whisper './test_file.mp3' --model tiny.en --output_dir ./output --device mps

I'm encountering the following errors:

loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/810eba08-405a-11ed-86e9-6af958a02716/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x384x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible

LLVM ERROR: Failed to infer result type(s).

zsh: abort whisper --model tiny.en --output_dir ./output --device mps

  warnings.warn('resource_tracker: There appear to be %d '

changeling avatar Jan 19 '23 06:01 changeling

Is there any update on this, or did anyone figure out how to get it to work?

sachit-menon avatar Feb 04 '23 21:02 sachit-menon

Same problem with macOS 13.2 on a MacBook Pro M2 Max:

loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/9e200cfa-7d96-11ed-886f-a23c4f261b56/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x1280x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
zsh: abort      whisper audio.wav --language en --model large
m2@Render ~ % /opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

renderpci avatar Feb 05 '23 11:02 renderpci

I'm getting the same error as @renderpci using the M1 Base Model

loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/9e200cfa-7d96-11ed-886f-a23c4f261b56/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x512x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
[1]    3746 abort      python3 test.py

test.py:

import whisper

model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(result["text"])

DontEatOreo avatar Feb 06 '23 13:02 DontEatOreo

FWIW I switched to the C++ port https://github.com/ggerganov/whisper.cpp/ and got a ~15x speedup compared to CPU pytorch on my M1 Pro. (But note that it doesn't have all the features/flags from the official whisper repo.)

saurabhsharan avatar Feb 06 '23 18:02 saurabhsharan

FWIW I switched to the C++ port https://github.com/ggerganov/whisper.cpp/

For us whisper.cpp is not an option:

Should I use whisper.cpp in my project?

whisper.cpp is a hobby project. It does not strive to provide a production ready implementation. The main goals of the implementation is to be educational, minimalistic, portable, hackable and performant. There are no guarantees that the implementation is correct and bug-free and stuff can break at any point in the future. Support and updates will depend mostly on contributions, since with time I will move on and won't dedicate too much time on the project.

If you plan to use whisper.cpp in your own project, keep in mind the above. My advice is to not put all your eggs into the whisper.cpp basket.

renderpci avatar Feb 06 '23 19:02 renderpci

The same error as @renderpci using the M2

whisper interview.mp4 --language en --model large --device mps

loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/9e200cfa-7d96-11ed-886f-a23c4f261b56/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x1280x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
zsh: abort      whisper interview.mp4 --language en --model large --device mps
pac@dd ~ % /opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

devpacdd avatar Feb 07 '23 08:02 devpacdd

Hey @devpacdd - this should be fixed in latest pytorch nightly (pip3 install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu). Let me know if you still see any issues. Thanks

DenisVieriu97 avatar Feb 21 '23 08:02 DenisVieriu97

Still have the same error after updating

Edit: After adding --fp16 False to the command, I now get a new error, as well as the old one:

/opt/homebrew/lib/python3.10/site-packages/whisper/decoding.py:633: UserWarning: The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
  audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
/AppleInternal/Library/BuildRoots/5b8a32f9-5db2-11ed-8aeb-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:794: failed assertion `[MPSNDArray, initWithBuffer:descriptor:] Error: buffer is not large enough. Must be 1007280 bytes
'
zsh: abort      whisper --model large --language de --task transcribe  --device mps --fp16
/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

manuthebyte avatar Feb 21 '23 10:02 manuthebyte

i was able to get it to kinda work: https://github.com/davabase/whisper_real_time/issues/5#issue-1596258783

cameronbergh avatar Feb 23 '23 05:02 cameronbergh

The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.) audio_features = audio_features.repeat_interleave(self.n_group, dim=0)

@manuthebyte could you please make sure you are on a recent nightly? repeat_interleave should be natively supported. If you could try grabbing today's nightly and give a try that would be awesome! (You can get today's nightly with pip3 install --pre --force-reinstall torch==2.0.0.dev20230224 --index-url https://download.pytorch.org/whl/nightly/cpu)
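
A quick check that repeat_interleave runs natively on MPS after updating (no CPU-fallback warning should be printed):

import torch

x = torch.arange(3, device="mps")
print(x.repeat_interleave(2))  # tensor([0, 0, 1, 1, 2, 2], device='mps:0')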

DenisVieriu97 avatar Feb 24 '23 21:02 DenisVieriu97

Wow!

when running: python3 transcribe_demo.py --model medium (from https://github.com/davabase/whisper_real_time)

with the following packages in my pipenv's requirements.txt

certifi==2022.12.7
charset-normalizer==3.0.1
ffmpeg-python==0.2.0
filelock==3.9.0
future==0.18.3
huggingface-hub==0.12.1
idna==3.4
more-itertools==9.0.0
mpmath==1.2.1
networkx==3.0rc1
numpy==1.24.2
openai-whisper @ git+https://github.com/openai/whisper.git@51c785f7c91b8c032a1fa79c0e8f862dea81b860
packaging==23.0
Pillow==9.4.0
PyAudio==0.2.13
PyYAML==6.0
regex==2022.10.31
requests==2.28.2
SpeechRecognition==3.9.0
sympy==1.11.1
tokenizers==0.13.2
torch==2.0.0.dev20230224
torchaudio==0.13.1
torchvision==0.14.1
tqdm==4.64.1
transformers==4.26.1
typing_extensions==4.4.0
urllib3==1.26.14

it gets every word, even while I was singing, in real time, with maybe ~50% GPU usage on the Apple M2 Pro Max!

cameronbergh avatar Feb 25 '23 01:02 cameronbergh

The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.) audio_features = audio_features.repeat_interleave(self.n_group, dim=0)

@manuthebyte could you please make sure you are on a recent nightly? repeat_interleave should be natively supported. If you could try grabbing today's nightly and give a try that would be awesome! (You can get today's nightly with pip3 install --pre --force-reinstall torch==2.0.0.dev20230224 --index-url https://download.pytorch.org/whl/nightly/cpu)

With my pip3 freeze being:

beautifulsoup4==4.11.2
certifi==2022.12.7
charset-normalizer==3.0.1
colorama==0.4.6
dnspython==2.3.0
ffmpeg-python==0.2.0
filelock==3.9.0
future==0.18.3
huggingface-hub==0.12.1
idna==3.4
more-itertools==9.0.0
mpmath==1.2.1
networkx==3.0rc1
numpy==1.24.2
openai-whisper @ git+https://github.com/openai/whisper.git@7858aa9c08d98f75575035ecd6481f462d66ca27
packaging==23.0
protobuf==4.21.12
PyYAML==6.0
regex==2022.10.31
requests==2.28.2
six==1.16.0
soupsieve==2.4
sympy==1.11.1
tokenizers==0.13.2
torch==2.0.0.dev20230224
tqdm==4.64.1
transformers==4.26.1
typing_extensions==4.4.0
urllib3==1.26.14

It now seems to use the GPU but I now get these errors:

/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py:636: UserWarning: 0MPS: no support for int64 repeats mask, casting it to int32 (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/Repeat.mm:236.)
  audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py:443: UserWarning: 1MPS: no support for int64 reduction ops, casting it to int32 (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/ReduceOps.mm:143.)
  timestamp_logprob = logprobs[k, self.tokenizer.timestamp_begin :].logsumexp(dim=-1)
/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py:444: UserWarning: 1MPS: no support for int64 min/max ops, casting it to int32 (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/ReduceOps.mm:1269.)
  max_text_token_logprob = logprobs[k, : self.tokenizer.timestamp_begin].max()
Traceback (most recent call last):
  File "/opt/homebrew/bin/whisper", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/whisper/transcribe.py", line 314, in cli
    result = transcribe(model, audio_path, temperature=temperature, **args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/whisper/transcribe.py", line 183, in transcribe
    result: DecodingResult = decode_with_fallback(segment)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/whisper/transcribe.py", line 118, in decode_with_fallback
    decode_result = model.decode(segment, options)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py", line 707, in decode
    result = DecodingTask(model, options).run(mel)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py", line 640, in run
    tokens, sum_logprobs, no_speech_probs = self._main_loop(audio_features, tokens)
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py", line 609, in _main_loop
    tokens, completed = self.decoder.update(tokens, logits, sum_logprobs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py", line 258, in update
    next_tokens = Categorical(logits=logits / self.temperature).sample()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/torch/distributions/categorical.py", line 66, in __init__
    super().__init__(batch_shape, validate_args=validate_args)
  File "/opt/homebrew/lib/python3.11/site-packages/torch/distributions/distribution.py", line 62, in __init__
    raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (5, 51865)) of distribution Categorical(logits: torch.Size([5, 51865])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan]], device='mps:0')

When running the command whisper --model small --language en --task transcribe ***.wav --device mps

manuthebyte avatar Feb 25 '23 09:02 manuthebyte

Hey @devpacdd - this should be fixed in latest pytorch nightly (pip3 install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu). Let me know if you still see any issues. Thanks

Great! It works! But in my test the GPU is slower than the CPU...?

Audio to transcribe: 1 minute, with the large model, language Catalan.

CPU: 2m 33s
GPU (--device mps): 4m 54s

I tried with different files and the result was the same: roughly double the time with the GPU enabled.

Is this normal? I expected the GPU to take less time than the CPU.

Best
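
For anyone who wants to reproduce the comparison, a rough timing sketch (the model, language, and file name are placeholders for my setup):

import time

import whisper

for device in ("cpu", "mps"):
    model = whisper.load_model("large", device=device)
    start = time.time()
    # fp16=False on both devices for a like-for-like comparison
    model.transcribe("audio.wav", language="ca", fp16=False)
    print(f"{device}: {time.time() - start:.1f}s")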

renderpci avatar Feb 27 '23 16:02 renderpci

Please support the M1 GPU

0x1FFFFF avatar Mar 05 '23 07:03 0x1FFFFF

I get this error while trying to use MPS

Here is the command I am running: whisper --model large --language en --task transcribe test.mp3 --device mps

$ whisper --model large --language en --task transcribe test.mp3 --device mps
Traceback (most recent call last):
  File "/Users/mukul/miniconda3/envs/ml/bin/whisper", line 8, in <module>
    sys.exit(cli())
  File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/whisper/transcribe.py", line 433, in cli
    model = load_model(model_name, device=device, download_root=model_dir)
  File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/whisper/__init__.py", line 159, in load_model
    return model.to(device)
  File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1170, in to
    return self._apply(convert)
  File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/torch/nn/modules/module.py", line 869, in _apply
    self._buffers[key] = fn(buf)
  File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1168, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Could not run 'aten::empty.memory_format' with arguments from the 'SparseMPS' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty.memory_format' is only available for these backends: [CPU, MPS, Meta, QuantizedCPU, QuantizedMeta, MkldnnCPU, SparseCPU, SparseMeta, SparseCsrCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMeta, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].

CPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterCPU.cpp:31085 [kernel]
MPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMPS.cpp:24065 [kernel]
Meta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMeta.cpp:26824 [kernel]
QuantizedCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedCPU.cpp:929 [kernel]
QuantizedMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedMeta.cpp:105 [kernel]
MkldnnCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMkldnnCPU.cpp:507 [kernel]
SparseCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCPU.cpp:1379 [kernel]
SparseMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseMeta.cpp:249 [kernel]
SparseCsrCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCsrCPU.cpp:1128 [kernel]
BackendSelect: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:734 [kernel]
Python: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:144 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:491 [backend fallback]
Functionalize: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:290 [backend fallback]
Named: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ConjugateFallback.cpp:21 [kernel]
Negative: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/NegateFallback.cpp:23 [kernel]
ZeroTensor: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:90 [kernel]
ADInplaceOrView: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:63 [backend fallback]
AutogradOther: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradCUDA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradHIP: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradXLA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradMPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradIPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradXPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradHPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradVE: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradLazy: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradMTIA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradPrivateUse1: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradPrivateUse2: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradPrivateUse3: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
Tracer: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:16872 [kernel]
AutocastCPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:487 [backend fallback]
AutocastCUDA: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:354 [backend fallback]
FuncTorchBatched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:815 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1073 [backend fallback]
VmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:210 [backend fallback]
PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:152 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:487 [backend fallback]
PythonDispatcher: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:148 [backend fallback]

mukulpatnaik avatar Mar 08 '23 12:03 mukulpatnaik

Any updates on this?

nicolopadovan avatar Mar 17 '23 10:03 nicolopadovan

pytorch/pytorch#87351

I'd love to hear a clear update too. It looks like there will be a lot of demand for this. (Mac M2 myself) Thank you OpenAI people!

andrewstec avatar Mar 18 '23 14:03 andrewstec

@mukulpatnaik My device is an M1 MacBook Pro. I got the same error with the latest version of whisper (v20230314); after switching to v20230124, everything works fine. (torch nightly version)

But MPS seems to be slower than CPU for my task, as @renderpci reported:

  • cpu 3.26 s
  • mps 5.25 s
  • cpu+torch2 compile 3.31 s
  • mps+torch2 compile 4.94 s

🫠

HFrost0 avatar Mar 20 '23 10:03 HFrost0

@HFrost0, what are your macOS, PyTorch, and Python versions? Different versions support different sets of operations, and PyTorch falls back to the CPU for the ones that are missing.
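
Something like this prints the relevant versions:

import platform

import torch

print("macOS:", platform.mac_ver()[0])
print("Python:", platform.python_version())
print("PyTorch:", torch.__version__)
print("MPS built:", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())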

oyarsa avatar Mar 20 '23 12:03 oyarsa