Can't disable gpu

Open thewh1teagle opened this issue 6 months ago • 12 comments

Setting whisper_context_params.use_gpu = false; doesn't work. It still tries to use OpenCL, which leads to a crash (specific to my case with OpenCL).

I use it in my project vibe, and this option is very important because I want to give users the best possible speed with the GPU but fall back to the CPU in case of an error.
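
To make the intended behaviour concrete, here is a minimal sketch of the GPU-first, CPU-fallback pattern. It assumes initialization reports a GPU failure by returning NULL rather than crashing. Every type and function below (`ctx_params`, `ctx`, `init_with_params`, `init_with_fallback`) is a hypothetical stand-in, not the whisper.cpp API; only the `use_gpu` field name mirrors the option mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical stand-ins for the real parameter and context types.
// Only the use_gpu field name mirrors the option discussed in this issue.
typedef struct { bool use_gpu; } ctx_params;
typedef struct { bool on_gpu;  } ctx;

static ctx g_ctx; // single static context, enough for the sketch

// Hypothetical initializer. gpu_works simulates whether the GPU backend
// can be brought up; a failed GPU init is reported as NULL (assumption).
ctx * init_with_params(ctx_params params, bool gpu_works) {
    if (params.use_gpu && !gpu_works) {
        return NULL;
    }
    g_ctx.on_gpu = params.use_gpu;
    return &g_ctx;
}

// Try the GPU first; if that fails, retry with use_gpu = false.
ctx * init_with_fallback(bool gpu_works) {
    ctx_params params = { .use_gpu = true };
    ctx * c = init_with_params(params, gpu_works);
    if (c == NULL) {
        params.use_gpu = false;
        c = init_with_params(params, gpu_works);
    }
    return c;
}
```

This only works if use_gpu = false really keeps the GPU backend from being touched, which is exactly what is broken here.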

thewh1teagle avatar Jan 13 '24 04:01 thewh1teagle

Hi @slaren , is there a way to completely turn off OpenCL during runtime? Thanks!

bobqianic avatar Jan 13 '24 11:01 bobqianic

Currently, there is no way to disable the GPU completely when the project is built with OpenCL support. Will think about fixing this.

In the meantime, does the information from https://github.com/ggerganov/whisper.cpp/issues/888 help in any way?

ggerganov avatar Jan 13 '24 11:01 ggerganov

@ggerganov It doesn't help. Currently I use OpenBLAS, so at least the performance is much better than without it. I'm looking to improve it further in vibe to get the best possible performance.

thewh1teagle avatar Jan 14 '24 23:01 thewh1teagle

@ggerganov I am also trying to turn off GPU use to allow for background processing on the iPhone. Apologies if this is obvious, but is it possible for me to turn off the OpenCL support so that I can turn off GPU use?

chuck-fyn avatar Jan 17 '24 07:01 chuck-fyn

You can easily update ggml.c to avoid all GPU calls (CUDA, OpenCL, etc.) if a global flag is set. For example here:

https://github.com/ggerganov/whisper.cpp/blob/1f50a7d29f85f221368e81201780e0c8dd631076/ggml.c#L9816-L9825

You can add a void ggml_gpu_set(bool enable); call that sets a global boolean flag and check the flag before each GPU call in ggml.c.

This is currently not officially supported in ggml because I want to figure out a better API. But as a quick workaround, I think this is the only option at the moment.
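
A minimal sketch of that workaround, under stated assumptions: `ggml_gpu_set` is the name proposed above, but its body, the flag `g_gpu_enabled`, and the helpers `gpu_try_mul_mat` and `compute_mul_mat` are hypothetical illustrations and not actual ggml code.

```c
#include <stdbool.h>

// Global switch, GPU enabled by default. Hypothetical: not part of ggml.
static bool g_gpu_enabled = true;

// Setter with the name proposed above; the body is an assumption.
void ggml_gpu_set(bool enable) {
    g_gpu_enabled = enable;
}

// Hypothetical stand-in for a GPU dispatch (CUDA, OpenCL, etc.).
// Returns true when the GPU backend handled the operation.
static bool gpu_try_mul_mat(void) {
    return true;
}

// Hypothetical compute entry point. The flag is checked before the GPU
// call, so no GPU code runs once the flag is cleared.
// Returns which path was taken, purely for illustration.
const char * compute_mul_mat(void) {
    if (g_gpu_enabled && gpu_try_mul_mat()) {
        return "gpu";
    }
    return "cpu"; // CPU fallback path
}
```

Because of the short-circuit `&&`, `gpu_try_mul_mat` is never called after `ggml_gpu_set(false)`. The same check would have to be repeated before every GPU call site in ggml.c.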

ggerganov avatar Jan 17 '24 19:01 ggerganov

@ggerganov I think it would eventually be useful to have an is_available() function for each GPU backend (CUDA, Core ML, etc.).

thewh1teagle avatar Jan 17 '24 19:01 thewh1teagle

@ggerganov Can we somehow get is_available() functions for each GPU platform, so we can easily decide which one to use? I just added Core ML support to the vibe app and the performance is incredible (20x faster or more).

Also, regarding the option to disable the GPU with use_gpu = false, is there any progress or plan? I'm eager to add GPU support for Linux and Windows as well.
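
Roughly what I have in mind, as a sketch only: none of these functions exist in whisper.cpp or ggml today. The `backend` struct, the `*_is_available` probes (hard-coded here), `k_backends`, and `pick_backend` are all hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical backend descriptor: a name plus an availability probe.
typedef struct {
    const char * name;
    bool (*is_available)(void);
} backend;

// Hypothetical probes with hard-coded results for the sketch. Real probes
// would query the driver or runtime without initializing the backend.
static bool cuda_is_available(void)   { return false; }
static bool coreml_is_available(void) { return true;  }
static bool cpu_is_available(void)    { return true;  }

// Backends listed in order of preference, with the CPU as the last resort.
static const backend k_backends[] = {
    { "cuda",   cuda_is_available   },
    { "coreml", coreml_is_available },
    { "cpu",    cpu_is_available    },
};

// Return the name of the first available backend, or NULL if none is.
const char * pick_backend(const backend * list, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (list[i].is_available()) {
            return list[i].name;
        }
    }
    return NULL;
}
```

An app could then call `pick_backend(k_backends, 3)` at startup and set use_gpu accordingly.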

thewh1teagle avatar Jan 22 '24 14:01 thewh1teagle

Hi, same issue on Linux with a CUDA build: it still seems to initialize and use the CUDA GPU despite the '-ng' CLI argument:

```
./cuda/main -m ggml-base.en.bin -f samples/jfk.wav -ng
...
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: Quadro RTX 3000, compute capability 7.5, VMM: yes
...
```

best

WilliamTambellini avatar Mar 28 '24 16:03 WilliamTambellini

@WilliamTambellini this no longer happens with the CUDA backend after the sync with ggml from yesterday.

slaren avatar Mar 28 '24 17:03 slaren

Tks @slaren Superb, I will pull and rebuild and retest. Congrats.

WilliamTambellini avatar Mar 28 '24 18:03 WilliamTambellini

Tks @slaren @ggerganov 1.5.4 is already a few months old, from Jan 5th. Would you mind doing a new release? Best

WilliamTambellini avatar Apr 02 '24 16:04 WilliamTambellini

I'll probably make a new one soon, yes

ggerganov avatar Apr 09 '24 15:04 ggerganov