stable-diffusion.cpp

slow ggml_vec_dot_f16 operator on Android

Open • Jimskns opened this issue 11 months ago • 1 comment

Hi @leejet, I compiled this project with CLBlast support and ran sd on my Android phone. It runs successfully, but it is quite slow, about 70 s per iteration. I profiled it with perf, converted the output to a flame graph (SD-perf), and found that ggml_vec_dot_f16 accounts for over 80% of the runtime. Does this op support Adreno GPU acceleration? If not, what is the reason?

Thanks a lot~

Jimskns, Mar 04 '24 11:03

I think it would be better to support a Vulkan backend for acceleration on Android devices, since ggml currently lacks good OpenCL support (it is even considered obsolete). Unfortunately, I don't know enough about Vulkan to implement kernels for the operations (I started watching some videos a few weeks ago because I want to stop using OpenGL).

FSSRepo, Mar 04 '24 13:03
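
For readers unfamiliar with the function named in the question: ggml_vec_dot_f16 is a CPU-side routine in ggml that computes the dot product of two half-precision (fp16) vectors, so time attributed to it in a perf flame graph is time spent on the CPU rather than on the GPU. The sketch below is only a scalar reference of that computation, not ggml's code: the names `half_to_float` and `vec_dot_f16_ref` are hypothetical, and the real implementation uses SIMD paths where available.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert an IEEE 754 binary16 bit pattern to float.
 * Portable bit manipulation, no compiler intrinsics assumed. */
static float half_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0) {
        if (man == 0) {
            bits = sign;                      /* signed zero */
        } else {
            /* Subnormal half: shift the mantissa until the implicit bit appears. */
            int e = -1;
            do { man <<= 1; e++; } while ((man & 0x400u) == 0);
            man &= 0x3FFu;
            bits = sign | ((uint32_t)(127 - 15 - e) << 23) | (man << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (man << 13);   /* infinity or NaN */
    } else {
        bits = sign | ((exp + 112u) << 23) | (man << 13); /* rebias 15 -> 127 */
    }

    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical scalar reference: dot product of two fp16 vectors,
 * accumulated in fp32. One multiply-add per element, all on the CPU. */
static float vec_dot_f16_ref(int n, const uint16_t *x, const uint16_t *y) {
    float sum = 0.0f;
    for (int i = 0; i < n; ++i) {
        sum += half_to_float(x[i]) * half_to_float(y[i]);
    }
    return sum;
}
```

As a usage example, with x = {0x3C00, 0x4000, 0x4200, 0x4400} (1, 2, 3, 4 in fp16) and y filled with 0x3800 (0.5), `vec_dot_f16_ref(4, x, y)` returns 5.0. A diffusion model performs this kind of inner loop an enormous number of times per iteration, which is consistent with it dominating the profile when the work is not offloaded.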