Uses MPS (Mac acceleration) by default when available
Currently, Whisper defaults to using the CPU on macOS, even though PyTorch has introduced the Metal Performance Shaders (MPS) backend for Apple devices in its nightly releases (more info).
With my changes to __init__.py, torch checks whether MPS is available when no torch.device has been specified. If MPS is available and CUDA is not, Whisper defaults to MPS.
This way, Mac users can experience speedups from their GPU by default.
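For reference, a minimal sketch of the device-selection order described above (my paraphrase, not the exact diff; the function name and argument are assumptions):

```python
import torch

def default_device(device=None):
    # An explicitly requested device wins, then CUDA, then MPS, then CPU.
    if device is not None:
        return device
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"
```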
@dwarkeshsp have you measured any speedups compared to using the CPU?
Doesn't this also require switching FP16 off?
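(For anyone trying this from Python rather than the CLI, a hedged sketch of what switching FP16 off looks like; whether it is actually required on MPS is exactly the open question above:)

```python
import whisper

# The Python-API equivalent of passing --fp16 False on the CLI.
model = whisper.load_model("tiny", device="mps")
result = model.transcribe("audio.mp3", fp16=False)  # force fp32 computation
print(result["text"])
```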
I'm getting this error when trying to use MPS:
/Users/diego/.pyenv/versions/3.10.6/lib/python3.10/site-packages/whisper-1.0-py3.10.egg/whisper/decoding.py:629: UserWarning: The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/diego/Projects/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
  audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
/AppleInternal/Library/BuildRoots/2d9b4df9-4b93-11ed-b0fc-2e32217d8374/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:794: failed assertion `[MPSNDArray, initWithBuffer:descriptor:] Error: buffer is not large enough. Must be 23200 bytes '
Abort trap: 6
/Users/diego/.pyenv/versions/3.10.6/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
any clues?
@DiegoGiovany Not an expert on this, but it looks like PyTorch itself is missing some operators for MPS. See for example https://github.com/pytorch/pytorch/issues/77764#issuecomment-1254352628 (which refers to repeat_interleave)
and https://github.com/pytorch/pytorch/issues/87219
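(Aside, a hedged workaround sketch for the missing-operator case: PyTorch can be told to silently fall back to the CPU for unsupported MPS ops via the PYTORCH_ENABLE_MPS_FALLBACK environment variable. This does not address the buffer-size assertion above.)

```python
import os

# Must be set before torch is imported; unsupported MPS ops then fall back
# to the CPU instead of raising (at some performance cost).
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # noqa: E402
import whisper  # noqa: E402

model = whisper.load_model("base", device="mps")
print(model.transcribe("audio.mp3", fp16=False)["text"])
```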
Thanks for your work. I just tried this. Unfortunately, it didn't work for me on my M1 Max with 32GB. Here is what I did:
pip install git+https://github.com/openai/whisper.git@refs/pull/382/head
No errors on install and it works fine when run without mps: whisper audiofile_name --model medium
When I run: whisper audiofile_name --model medium --device mps
Here is the error I get:
Detecting language using up to the first 30 seconds. Use --language to specify the language
loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/810eba08-405a-11ed-86e9-6af958a02716/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x1024x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
When I run: whisper audiofile_name --model medium --device mps --fp16 False
Here is the error I get:
Detecting language using up to the first 30 seconds. Use --language to specify the language
Detected language: English
/anaconda3/lib/python3.9/site-packages/whisper/decoding.py:633: UserWarning: The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
/AppleInternal/Library/BuildRoots/f0468ab4-4115-11ed-8edc-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:794: failed assertion `[MPSNDArray, initWithBuffer:descriptor:] Error: buffer is not large enough. Must be 1007280 bytes
Basically, same error as @DiegoGiovany.
Any ideas on how to fix?
+1 for me! I'm actually using an Intel Mac with Radeon Pro 560X 4 GB...
Related https://github.com/pytorch/pytorch/issues/87351
@dwarkeshsp
Not working for me, on a 2015 MBP with PyTorch 1.3 stable, an RX580 eGPU, and macOS 12.3.
I changed the code the same way as yours.
Switching to --device mps shows an error, so maybe there is still something else to change or modify.
With --device cpu, it works.
With other PyTorch-Metal projects, MPS works.
What's the status on this?
I also see the same errors as others mentioned above, on an M1 Mac running arm64 Python.
On an M1 16" MBP with 16GB running macOS 13.0.1, I'm seeing the following with openai-whisper-20230117:
Using this command:
(venv) whisper_ai_playground % whisper './test_file.mp3' --model tiny.en --output_dir ./output --device mps
I'm encountering the following errors:
loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/810eba08-405a-11ed-86e9-6af958a02716/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x384x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
zsh: abort whisper --model tiny.en --output_dir ./output --device mps
warnings.warn('resource_tracker: There appear to be %d '
Is there any update on this, or did anyone figure out how to get it to work?
Same problem with macOS 13.2 on a MacBook Pro M2 Max:
loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/9e200cfa-7d96-11ed-886f-a23c4f261b56/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x1280x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
zsh: abort whisper audio.wav --language en --model large
m2@Render ~ % /opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
I'm getting the same error as @renderpci using the M1 Base Model
loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/9e200cfa-7d96-11ed-886f-a23c4f261b56/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x512x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
[1] 3746 abort python3 test.py
test.py:
import whisper
model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(result["text"])
FWIW I switched to the C++ port https://github.com/ggerganov/whisper.cpp/ and got a ~15x speedup compared to CPU pytorch on my M1 Pro. (But note that it doesn't have all the features/flags from the official whisper repo.)
FWIW I switched to the C++ port https://github.com/ggerganov/whisper.cpp/
For us whisper.cpp is not an option:
Should I use whisper.cpp in my project?
whisper.cpp is a hobby project. It does not strive to provide a production ready implementation. The main goals of the implementation is to be educational, minimalistic, portable, hackable and performant. There are no guarantees that the implementation is correct and bug-free and stuff can break at any point in the future. Support and updates will depend mostly on contributions, since with time I will move on and won't dedicate too much time on the project.
If you plan to use whisper.cpp in your own project, keep in mind the above. My advice is to not put all your eggs into the whisper.cpp basket.
The same error as @renderpci using the M2
whisper interview.mp4 --language en --model large --device mps
loc("mps_multiply"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/9e200cfa-7d96-11ed-886f-a23c4f261b56/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x1280x3000xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
zsh: abort whisper interview.mp4 --language en --model large --device mps
pac@dd ~ % /opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Hey @devpacdd - this should be fixed in latest pytorch nightly (pip3 install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu). Let me know if you still see any issues. Thanks
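(If helpful, a quick way to sanity-check the nightly after that install; this is generic PyTorch, not something specific to this PR:)

```python
import torch

print(torch.__version__)                  # should report a 2.x dev/nightly build
print(torch.backends.mps.is_built())      # True if this build includes MPS support
print(torch.backends.mps.is_available())  # True on Apple-silicon Macs on macOS 12.3+
```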
Still have the same error after updating
Edit: After adding --fp16 False to the command, I now get a new error, as well as the old one:
/opt/homebrew/lib/python3.10/site-packages/whisper/decoding.py:633: UserWarning: The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
/AppleInternal/Library/BuildRoots/5b8a32f9-5db2-11ed-8aeb-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:794: failed assertion `[MPSNDArray, initWithBuffer:descriptor:] Error: buffer is not large enough. Must be 1007280 bytes
'
zsh: abort whisper --model large --language de --task transcribe --device mps --fp16
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
I was able to get it to kinda work: https://github.com/davabase/whisper_real_time/issues/5#issue-1596258783
The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.) audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
@manuthebyte could you please make sure you are on a recent nightly? repeat_interleave should be natively supported. If you could try grabbing today's nightly and give it a try, that would be awesome! (You can get today's nightly with pip3 install --pre --force-reinstall torch==2.0.0.dev20230224 --index-url https://download.pytorch.org/whl/nightly/cpu)
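(A quick, hedged way to check whether repeat_interleave runs natively on MPS for a given install; on a build with native support, no CPU-fallback warning should be printed:)

```python
import torch

x = torch.arange(4, device="mps")
# On a recent nightly this should run without the
# 'aten::repeat_interleave.self_int ... fall back to run on the CPU' warning.
print(x.repeat_interleave(2))
```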
Wow!
When running:
python3 transcribe_demo.py --model medium
(from https://github.com/davabase/whisper_real_time)
with the following packages in my pipenv's requirements.txt
certifi==2022.12.7
charset-normalizer==3.0.1
ffmpeg-python==0.2.0
filelock==3.9.0
future==0.18.3
huggingface-hub==0.12.1
idna==3.4
more-itertools==9.0.0
mpmath==1.2.1
networkx==3.0rc1
numpy==1.24.2
openai-whisper @ git+https://github.com/openai/whisper.git@51c785f7c91b8c032a1fa79c0e8f862dea81b860
packaging==23.0
Pillow==9.4.0
PyAudio==0.2.13
PyYAML==6.0
regex==2022.10.31
requests==2.28.2
SpeechRecognition==3.9.0
sympy==1.11.1
tokenizers==0.13.2
torch==2.0.0.dev20230224
torchaudio==0.13.1
torchvision==0.14.1
tqdm==4.64.1
transformers==4.26.1
typing_extensions==4.4.0
urllib3==1.26.14
it gets every word! While I was singing! In real time, with maybe ~50% GPU usage on the Apple M2 Pro Max.
The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.) audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
@manuthebyte could you please make sure you are on a recent nightly? repeat_interleave should be natively supported. If you could try grabbing today's nightly and give it a try, that would be awesome! (You can get today's nightly with pip3 install --pre --force-reinstall torch==2.0.0.dev20230224 --index-url https://download.pytorch.org/whl/nightly/cpu)
With my pip3 freeze being:
beautifulsoup4==4.11.2
certifi==2022.12.7
charset-normalizer==3.0.1
colorama==0.4.6
dnspython==2.3.0
ffmpeg-python==0.2.0
filelock==3.9.0
future==0.18.3
huggingface-hub==0.12.1
idna==3.4
more-itertools==9.0.0
mpmath==1.2.1
networkx==3.0rc1
numpy==1.24.2
openai-whisper @ git+https://github.com/openai/whisper.git@7858aa9c08d98f75575035ecd6481f462d66ca27
packaging==23.0
protobuf==4.21.12
PyYAML==6.0
regex==2022.10.31
requests==2.28.2
six==1.16.0
soupsieve==2.4
sympy==1.11.1
tokenizers==0.13.2
torch==2.0.0.dev20230224
tqdm==4.64.1
transformers==4.26.1
typing_extensions==4.4.0
urllib3==1.26.14
It now seems to use the GPU, but now I get these errors:
/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py:636: UserWarning: 0MPS: no support for int64 repeats mask, casting it to int32 (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/Repeat.mm:236.)
audio_features = audio_features.repeat_interleave(self.n_group, dim=0)
/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py:443: UserWarning: 1MPS: no support for int64 reduction ops, casting it to int32 (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/ReduceOps.mm:143.)
timestamp_logprob = logprobs[k, self.tokenizer.timestamp_begin :].logsumexp(dim=-1)
/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py:444: UserWarning: 1MPS: no support for int64 min/max ops, casting it to int32 (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/mps/operations/ReduceOps.mm:1269.)
max_text_token_logprob = logprobs[k, : self.tokenizer.timestamp_begin].max()
Traceback (most recent call last):
File "/opt/homebrew/bin/whisper", line 8, in <module>
sys.exit(cli())
^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/whisper/transcribe.py", line 314, in cli
result = transcribe(model, audio_path, temperature=temperature, **args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/whisper/transcribe.py", line 183, in transcribe
result: DecodingResult = decode_with_fallback(segment)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/whisper/transcribe.py", line 118, in decode_with_fallback
decode_result = model.decode(segment, options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py", line 707, in decode
result = DecodingTask(model, options).run(mel)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py", line 640, in run
tokens, sum_logprobs, no_speech_probs = self._main_loop(audio_features, tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py", line 609, in _main_loop
tokens, completed = self.decoder.update(tokens, logits, sum_logprobs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/whisper/decoding.py", line 258, in update
next_tokens = Categorical(logits=logits / self.temperature).sample()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/torch/distributions/categorical.py", line 66, in __init__
super().__init__(batch_shape, validate_args=validate_args)
File "/opt/homebrew/lib/python3.11/site-packages/torch/distributions/distribution.py", line 62, in __init__
raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (5, 51865)) of distribution Categorical(logits: torch.Size([5, 51865])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]], device='mps:0')
When running the command whisper --model small --language en --task transcribe ***.wav --device mps
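(Not a confirmed fix from this thread, but a hedged workaround sketch: the NaN logits only crash inside Categorical when a non-zero sampling temperature is reached, so forcing greedy decoding and fp32 may sidestep it:)

```python
import whisper

model = whisper.load_model("small", device="mps")
result = model.transcribe(
    "audio.wav",
    language="en",
    temperature=0.0,  # single temperature: greedy decoding, no sampling fallback
    fp16=False,       # keep MPS computation in fp32
)
print(result["text"])
```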
Hey @devpacdd - this should be fixed in latest pytorch nightly (pip3 install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu). Let me know if you still see any issues. Thanks
Great! It works! But... in my test the GPU is slower than the CPU?
Audio to transcribe: 1 minute, model large, language Catalan.
CPU: 2m 33s. GPU (--device mps): 4m 54s.
I tried with different files and the result was the same; roughly double the time with the GPU enabled.
Is this normal? I expected the GPU to take less time than the CPU.
Best
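(For anyone who wants to reproduce this kind of comparison, a rough timing sketch; audio.wav, the large model, and Catalan are taken from the report above, the rest is an assumption:)

```python
import time
import whisper

for device in ("cpu", "mps"):
    model = whisper.load_model("large", device=device)
    start = time.time()
    model.transcribe("audio.wav", language="ca", fp16=False)
    print(f"{device}: {time.time() - start:.1f}s")
```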
Please support the M1 GPU.
I get this error while trying to use MPS
Here is the command I am running: whisper --model large --language en --task transcribe test.mp3 --device mps
$ whisper --model large --language en --task transcribe test.mp3 --device mps
Traceback (most recent call last):
File "/Users/mukul/miniconda3/envs/ml/bin/whisper", line 8, in <module>
sys.exit(cli())
File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/whisper/transcribe.py", line 433, in cli
model = load_model(model_name, device=device, download_root=model_dir)
File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/whisper/__init__.py", line 159, in load_model
return model.to(device)
File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1170, in to
return self._apply(convert)
File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/torch/nn/modules/module.py", line 869, in _apply
self._buffers[key] = fn(buf)
File "/Users/mukul/miniconda3/envs/ml/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1168, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Could not run 'aten::empty.memory_format' with arguments from the 'SparseMPS' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty.memory_format' is only available for these backends: [CPU, MPS, Meta, QuantizedCPU, QuantizedMeta, MkldnnCPU, SparseCPU, SparseMeta, SparseCsrCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMeta, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].
CPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterCPU.cpp:31085 [kernel]
MPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMPS.cpp:24065 [kernel]
Meta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMeta.cpp:26824 [kernel]
QuantizedCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedCPU.cpp:929 [kernel]
QuantizedMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedMeta.cpp:105 [kernel]
MkldnnCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMkldnnCPU.cpp:507 [kernel]
SparseCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCPU.cpp:1379 [kernel]
SparseMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseMeta.cpp:249 [kernel]
SparseCsrCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCsrCPU.cpp:1128 [kernel]
BackendSelect: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:734 [kernel]
Python: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:144 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:491 [backend fallback]
Functionalize: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:290 [backend fallback]
Named: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ConjugateFallback.cpp:21 [kernel]
Negative: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/NegateFallback.cpp:23 [kernel]
ZeroTensor: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:90 [kernel]
ADInplaceOrView: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:63 [backend fallback]
AutogradOther: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradCUDA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradHIP: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradXLA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradMPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradIPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradXPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradHPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradVE: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradLazy: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradMTIA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradPrivateUse1: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradPrivateUse2: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradPrivateUse3: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
AutogradNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:17927 [autograd kernel]
Tracer: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:16872 [kernel]
AutocastCPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:487 [backend fallback]
AutocastCUDA: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:354 [backend fallback]
FuncTorchBatched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:815 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1073 [backend fallback]
VmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:210 [backend fallback]
PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:152 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:487 [backend fallback]
PythonDispatcher: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:148 [backend fallback]
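(A hedged workaround sketch for the SparseMPS error above: recent Whisper versions register a sparse buffer for the alignment heads, and sparse tensors apparently cannot be moved to MPS, so one option is to load the model on the CPU and densify any sparse buffers before moving it. This is an untested assumption, not a confirmed fix:)

```python
import torch
import whisper

# Load on the CPU first, then replace any sparse buffers with dense copies
# before moving to MPS, since 'aten::empty.memory_format' has no SparseMPS kernel.
model = whisper.load_model("large", device="cpu")
for name, buf in model.named_buffers():
    if buf.is_sparse:
        module_path, _, buf_name = name.rpartition(".")
        module = model.get_submodule(module_path) if module_path else model
        module.register_buffer(buf_name, buf.to_dense(), persistent=False)
model = model.to("mps")
```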
Any updates on this?
pytorch/pytorch#87351
I'd love to hear a clear update too. It looks like there will be a lot of demand for this. (Mac M2 myself) Thank you OpenAI people!
@mukulpatnaik My device is an M1 MacBook Pro. I got the same error with the latest version of Whisper (v20230314); then I switched to v20230124 and everything works fine (torch nightly version).
But it seems like MPS is slower than CPU, as @renderpci reported, for my task:
- cpu 3.26 s
- mps 5.25 s
- cpu+torch2 compile 3.31 s
- mps+torch2 compile 4.94 s
🫠
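(For context, a hedged sketch of what the "torch2 compile" variants above might look like; exactly how the poster applied torch.compile is not stated, so compiling the encoder and decoder separately is an assumption:)

```python
import torch
import whisper

model = whisper.load_model("base", device="mps")
# PyTorch 2.0's torch.compile wraps a module and optimizes its forward pass.
model.encoder = torch.compile(model.encoder)
model.decoder = torch.compile(model.decoder)
result = model.transcribe("audio.mp3", fp16=False)
print(result["text"])
```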
@HFrost0, what are your macOS, PyTorch, and Python versions? Different versions support different sets of MPS operations, and PyTorch falls back to the CPU for the unsupported ones.