monkeyplug
Whisper defaults to CPU instead of utilizing Nvidia GPU on Windows 11
A warning upon first running the whisper model clued me in to it not using hardware acceleration:
UserWarning: FP16 is not supported on CPU; using FP32 instead
All I had to do in order to enable CUDA support was first uninstall Torch:
python -m pip uninstall torch
And reinstall with this command:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Confirm that CUDA is available in Python by running:
import torch
torch.cuda.is_available()
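Going one step beyond that two-line check, a quick sketch (assuming the CUDA wheel installed correctly; the exact version string and device name will differ on your machine) also prints the PyTorch build and the detected GPU:

```python
import torch

# A "+cu121" suffix indicates a CUDA-enabled wheel rather than the CPU-only build.
print(torch.__version__)

# True means whisper/monkeyplug can use the GPU.
print(torch.cuda.is_available())

if torch.cuda.is_available():
    # e.g. "NVIDIA GeForce RTX 3070"
    print(torch.cuda.get_device_name(0))
```

If `is_available()` still returns False after the reinstall, the CPU-only wheel is likely still the one being imported.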
monkeyplug/whisper should now correctly use your GPU and significantly speed up operations. A YouTube video with a runtime of 10:42 took 13 minutes and 42 seconds to process on my CPU with the medium.en model. After successfully enabling CUDA support, that same video took 3 minutes and 13 seconds to process on an RTX 3070, with noticeably better accuracy than the default base.en model.
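For a rough sense of scale, the timings quoted above work out to roughly a 4.3x speedup (the numbers below are just those figures restated, nothing measured anew):

```python
# Timings quoted above: 13m42s on CPU vs. 3m13s on an RTX 3070
# for a 10:42 video, using the medium.en model.
video_s = 10 * 60 + 42   # 642 s of audio
cpu_s = 13 * 60 + 42     # 822 s to process on CPU
gpu_s = 3 * 60 + 13      # 193 s to process on GPU

print(f"speedup: {cpu_s / gpu_s:.1f}x")               # → speedup: 4.3x
print(f"CPU realtime factor: {cpu_s / video_s:.2f}")  # → 1.28 (slower than realtime)
print(f"GPU realtime factor: {gpu_s / video_s:.2f}")  # → 0.30 (over 3x faster than realtime)
```

In other words, the CPU run was slower than the video's own runtime, while the GPU run finished in under a third of it.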
I also caught several warning messages raised during a job (possibly related to generating timestamps?), but they don't seem to affect the operation at all:
C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:42: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation... warnings.warn(
C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:146: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower DTW implementation... warnings.warn(
Noticed that #3 might be in the works, which could help with this, but I thought it would be helpful to share my findings in the meantime.
PS: Whisper really is another tier of accuracy and is much appreciated.
Interesting, on my Linux machine it was using the GPU right out of the gate with just pip install openai-whisper, without any other steps on my end (double-checked with nvidia-smi during processing).
Oh, and if it helps: this is a fresh install of Windows 11, and I actually used that very same command to install whisper after installing Python 3.12. Strange indeed.
Are you still having this issue? I tried your steps and mine persisted.
Right now I don't have access to a Windows machine with a GPU, so I don't have any way to confirm or look into this.
Are you still having this issue? I tried your steps and mine persisted.
Sorry to hear. It's been working just fine for me ever since. Could you provide more info about your setup? Operating system, whether you tried torch.cuda.is_available(), what it returns, any error messages you might've seen, etc.
Windows 11, and I'm getting the exact same error messages as in your original post. I'm currently just using a separate whisper program instead, so no big deal. And yes, torch.cuda.is_available() returns True.