[Question] What is the status of Vulkan backend?

Open DanielMazurkiewicz opened this issue 1 year ago • 12 comments

Vulkan may not be the best, fastest, or easiest solution for inference, but it is probably the most portable approach to GPU acceleration.

Is anyone actively working on adding support for it? If so, what is the status/progress? If not, is it planned?

DanielMazurkiewicz avatar Sep 27 '23 18:09 DanielMazurkiewicz

There are two related PRs: https://github.com/ggerganov/llama.cpp/pull/2059 and https://github.com/ggerganov/llama.cpp/pull/2039

Green-Sky avatar Sep 28 '23 13:09 Green-Sky

Yeah, I'm working on it. Let me know if you have any questions. It's a big project, but I'm making progress.

0cc4m avatar Sep 29 '23 21:09 0cc4m

Is there any low-hanging fruit a newcomer to the project could help with?

Calandiel avatar Nov 16 '23 09:11 Calandiel

@Calandiel If you have experience with Vulkan, maybe. Otherwise probably not.

0cc4m avatar Nov 16 '23 09:11 0cc4m

I have. I've written Vulkan-based render pipelines professionally and made toy neural networks in Vulkan trained with SGD. I've been working with it in at least some capacity for the last 4 years or so.

Calandiel avatar Nov 16 '23 13:11 Calandiel

Oh cool, I'd be glad to work something out. If you have Discord, send me a message (_occam); otherwise send me an email and we'll find another way.

0cc4m avatar Nov 16 '23 14:11 0cc4m

Will do, see you on Discord!

Calandiel avatar Nov 17 '23 07:11 Calandiel

I think nomic-ai has a functional Kompute version of llama.cpp right now (https://github.com/nomic-ai/llama.cpp), and GPT4All is plenty fast on my 7900 XTX via Vulkan. But I am not sure how to integrate this into ggml, as I am not a programmer.

sorasoras avatar Dec 10 '23 17:12 sorasoras

@sorasoras https://github.com/ggerganov/llama.cpp/pull/4456

Green-Sky avatar Dec 15 '23 15:12 Green-Sky

The Vulkan and Kompute backends have been merged into llama.cpp; all that is left is to update the CMake build files so they can be used in other ggml projects.

slaren avatar Jan 29 '24 23:01 slaren

Does this affect anything using ggml, such as whisper.cpp, stable-diffusion.cpp, etc.?

Kreijstal avatar Mar 04 '24 21:03 Kreijstal

@Kreijstal Backends are usually upstreamed to ggml, but consumers of the ggml API need to use them explicitly.

E.g., new-ish backend code here in ggml:
https://github.com/ggerganov/ggml/blob/master/src/ggml-kompute.h
https://github.com/ggerganov/ggml/blob/master/src/ggml-sycl.h
https://github.com/ggerganov/ggml/blob/master/src/ggml-vulkan.h

Green-Sky avatar Mar 04 '24 22:03 Green-Sky