Metal support

Open ggerganov opened this issue 1 year ago • 7 comments

This is a quick and dirty implementation of GPU support for Apple hardware using Metal Performance Shaders (MPS). It demonstrates how part of the feed-forward layer in the encoder can be offloaded to the GPU.

On my MacBook M1 Pro, I don't observe a significant performance gain over the original implementation. Either there is a problem in my MPS integration, or the AMX coprocessor is simply doing a good enough job already and adding Metal does not really help.

In any case, this PR can be a good starting point for anyone interested in adding GPU support to ggml. I think a similar approach can be taken for CUDA.

For now, I don't plan to merge this into master unless the performance gets better.
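
For readers who want to picture the offloading pattern described above, here is a minimal sketch in C. It is not the code from this PR: the names `mul_mat_fn`, `set_gpu_backend`, `mul_mat_dispatch`, and `gpu_stub_unavailable` are hypothetical, and the GPU hook is a stub standing in for whatever MPS or CUDA call a real backend would make. The idea is only that the matrix multiplication tries a registered GPU backend first and falls back to the CPU path when none is available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend signature (an assumption, not ggml's API):
 * computes C (m x n) = A (m x k) * B (k x n), all row-major.
 * Returns 0 on success, non-zero if the backend cannot run. */
typedef int (*mul_mat_fn)(const float *a, const float *b, float *c,
                          size_t m, size_t k, size_t n);

/* Plain CPU reference path, always available. */
int mul_mat_cpu(const float *a, const float *b, float *c,
                size_t m, size_t k, size_t n) {
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t p = 0; p < k; p++) {
                sum += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return 0;
}

/* Placeholder for a GPU backend. A real one would hand the buffers to
 * MPS or CUDA here; this stub just reports that no device is available. */
int gpu_stub_unavailable(const float *a, const float *b, float *c,
                         size_t m, size_t k, size_t n) {
    (void)a; (void)b; (void)c; (void)m; (void)k; (void)n;
    return -1;
}

/* Currently registered GPU backend; NULL means CPU only. */
static mul_mat_fn g_gpu_mul_mat = NULL;

void set_gpu_backend(mul_mat_fn fn) { g_gpu_mul_mat = fn; }

/* Try the GPU backend first, then fall back to the CPU.
 * Returns 1 if the GPU path produced the result, 0 if the CPU did. */
int mul_mat_dispatch(const float *a, const float *b, float *c,
                     size_t m, size_t k, size_t n) {
    assert(a != NULL && b != NULL && c != NULL);
    if (g_gpu_mul_mat != NULL && g_gpu_mul_mat(a, b, c, m, k, n) == 0) {
        return 1;
    }
    mul_mat_cpu(a, b, c, m, k, n);
    return 0;
}
```

With this shape, only the operations that have a GPU implementation need to go through the hook, and everything else keeps running on the CPU unchanged. Whether the offload pays off still depends on how the transfer and dispatch overhead compares with the CPU path.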

ggerganov, Nov 07 '22 19:11

Can't build it on M1 Max:

c++ -I. -I./examples -O3 -std=c++11 -pthread examples/main/main.cpp whisper.o ggml.o -o main -framework Accelerate
Undefined symbols for architecture arm64:
  "_ggml_mtl_alloc", referenced from:
      _ggml_new_tensor_mtl_impl in ggml.o
  "_ggml_mtl_init", referenced from:
      _ggml_init in ggml.o
  "_ggml_mtl_mul_mat_f16", referenced from:
      _ggml_compute_forward_mul_mat_f16_f32 in ggml.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [main] Error 1

DiegoGiovany, Nov 11 '22 23:11

@DiegoGiovany I forgot to update the Makefile. It should work now: run make clean, then make.

ggerganov, Nov 12 '22 06:11