Whisper defaults to CPU instead of utilizing Nvidia GPU on Windows 11

Open selfAndrewKB opened this issue 1 year ago • 6 comments

A warning upon first running the whisper model clued me in to it not using hardware acceleration:

UserWarning: FP16 is not supported on CPU; using FP32 instead

All I had to do to enable CUDA support was first uninstall Torch: pip3 uninstall torch

And reinstall with this command: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Confirm that CUDA is available in Python by running: import torch; torch.cuda.is_available()
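For anyone who wants a bit more detail than True/False, here is a minimal sketch of a fuller check. The helper name cuda_status is made up for illustration; it relies only on torch.__version__, torch.version.cuda, torch.cuda.is_available() and torch.cuda.get_device_name(). A CPU-only build of torch reports None for the CUDA build version, which would be consistent with the FP16 warning above.

```python
import importlib


def cuda_status():
    """Report whether the installed torch build can see a CUDA device.

    Hypothetical helper for illustration; not part of monkeyplug or whisper.
    """
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed in this environment at all.
        return {
            "torch_installed": False,
            "torch_version": None,
            "cuda_build": None,
            "cuda_available": False,
            "device_name": None,
        }

    available = bool(torch.cuda.is_available())
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None on CPU-only builds, a version string on CUDA builds.
        "cuda_build": torch.version.cuda,
        "cuda_available": available,
        "device_name": torch.cuda.get_device_name(0) if available else None,
    }


if __name__ == "__main__":
    for key, value in cuda_status().items():
        print(f"{key}: {value}")
```

If cuda_build comes back as None, the installed wheel is probably a CPU-only one, and the uninstall/reinstall steps above are the likely fix.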

monkeyplug/whisper should now correctly use your GPU to significantly speed up operations. A YouTube video with a runtime of 10:42 took 13 minutes and 42 seconds to process on my CPU with the medium.en model. After successfully enabling CUDA support, that same video took 3 minutes and 13 seconds to process on an RTX 3070, with noticeably better accuracy than the default base.en.
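To put those timings in perspective, a quick back-of-the-envelope calculation (just arithmetic on the numbers above, nothing measured beyond them) works out to roughly a 4.3x speed-up, with the GPU run finishing in about 0.3x the video's runtime versus about 1.28x on the CPU:

```python
def to_seconds(minutes, seconds):
    """Convert a minutes:seconds duration to total seconds."""
    return minutes * 60 + seconds


video = to_seconds(10, 42)  # runtime of the video
cpu = to_seconds(13, 42)    # processing time on CPU (medium.en)
gpu = to_seconds(3, 13)     # processing time on the RTX 3070 (medium.en)

speedup = cpu / gpu   # how many times faster the GPU run was
cpu_rtf = cpu / video  # processing time relative to video length (CPU)
gpu_rtf = gpu / video  # processing time relative to video length (GPU)

print(f"speed-up: {speedup:.2f}x")
print(f"CPU: {cpu_rtf:.2f}x video length, GPU: {gpu_rtf:.2f}x video length")
```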

I caught several warning messages that were raised during a job (might be related to generating timestamps?), but they don't seem to affect the operation at all:

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:42: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation... warnings.warn(

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:146: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower DTW implementation... warnings.warn(

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:42: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation... warnings.warn(

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:146: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower DTW implementation... warnings.warn(
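If the repeated warnings are just noise in your own scripts, they can be filtered with Python's standard warnings module. This is only a sketch: the function name is made up, it assumes you are calling whisper from your own Python code rather than through the monkeyplug command line, and it does not fix the underlying cause (per the messages, the slower fallback implementations are still what runs). It only hides UserWarnings whose text starts with the prefix shown above.

```python
import warnings

# Prefix shared by both warnings quoted above.
TRITON_WARNING_PREFIX = "Failed to launch Triton kernels"


def silence_triton_fallback_warnings():
    """Hide the Triton fallback UserWarnings; other warnings are untouched.

    Hypothetical helper for illustration. Call it before transcribing.
    """
    warnings.filterwarnings(
        "ignore",
        message=TRITON_WARNING_PREFIX,  # regex matched at the start of the message
        category=UserWarning,
    )
```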

Noticed that #3 might be in the works, which might help, but I thought it could be wise/helpful to share my findings regardless in the meantime.

PS: Whisper really is another tier of accuracy and is much appreciated.

selfAndrewKB avatar Feb 24 '24 21:02 selfAndrewKB

Interesting, on my Linux machine it was using the GPU right out of the gate just with pip install openai-whisper without any other steps on my end (double-checked with nvidia-smi during processing).

mmguero avatar Feb 25 '24 00:02 mmguero

Oh, and if it helps, this is a fresh install of Windows 11, and I actually used that very same command to install whisper after installing Python 3.12. Strange indeed.

selfAndrewKB avatar Feb 25 '24 00:02 selfAndrewKB

Are you still having this issue at all? I tried your steps and mine persisted.

bradyj04 avatar Apr 25 '24 09:04 bradyj04

Right now I don't have access to a Windows machine with a GPU, so I don't have any way to confirm or look into this.

mmguero avatar Apr 25 '24 13:04 mmguero

Are you still having this issue at all? I tried your steps and mine persisted.

Sorry to hear. It's been working just fine ever since. Could you provide more info about your setup? Operating system, whether you tried torch.cuda.is_available(), what it returns, any error messages you might've seen, etc.

selfAndrewKB avatar Apr 25 '24 19:04 selfAndrewKB

Windows 11, and I'm getting the exact same error messages as in your original post. I'm currently just using a separate whisper program instead, so no big deal. And yes, torch.cuda.is_available() returns True.

bradyj04 avatar Apr 25 '24 20:04 bradyj04