[Feature] NPU backend support, in addition to the existing CPU, Vulkan, OpenCL, etc.
Feature Summary
NPU support for the backend
Detailed Description
Add NPU support to the backend, in addition to the existing CPU, Vulkan, and OpenCL backends.
Alternatives you considered
No response
Additional context
No response
I suggest you open a request on the llama.cpp project, since ggml (our backend library) is developed primarily there.
https://github.com/FastFlowLM/FastFlowLM
As far as I know, they are the only ones with a well-optimized inference engine for NPUs (specifically XDNA2), and it’s impressive what they achieve within a TDP below 2 W. It makes me think that AMD should focus resources on ASICs and the software ecosystem for them.
Hey, I would love some help. I am working on an Android app to run a local quantized SDXL model.
I tried stable-diffusion.cpp, but since it has no NPU support it takes a huge amount of time. I am working on a Samsung Exynos processor; could you suggest any other alternative I could use? Anyway, thanks for the reply.
MNN has NPU acceleration, but only for the latest Snapdragon chips, I think. You should look into Samsung's documentation if you plan to study how to implement it.
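For reference, one common route from an Android app to whatever NPU driver the vendor exposes is the NNAPI delegate in TensorFlow Lite. The sketch below is only an illustration under assumptions that are not from this thread: it presumes you already have a converted `.tflite` model with a single float32 input and output, and whether Samsung's Exynos NNAPI driver actually accelerates that model (rather than falling back to the CPU) has to be verified on the device itself.

```kotlin
// Minimal sketch: hand a TFLite model to the device accelerator via the NNAPI delegate.
// Gradle: org.tensorflow:tensorflow-lite (the NNAPI delegate ships with the core runtime).
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.nnapi.NnApiDelegate
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder

fun runOnNpu(modelFile: File, input: FloatArray, outputSize: Int): FloatArray {
    // NNAPI routes supported ops to the vendor's accelerator driver (NPU/DSP/GPU);
    // anything unsupported silently falls back to the CPU.
    val nnApiDelegate = NnApiDelegate()
    val options = Interpreter.Options().addDelegate(nnApiDelegate)

    // The Interpreter expects the .tflite model in a direct (or memory-mapped) ByteBuffer.
    // For simplicity this sketch reads the whole file in one call.
    val modelBuffer = ByteBuffer.allocateDirect(modelFile.length().toInt())
        .order(ByteOrder.nativeOrder())
    modelFile.inputStream().channel.use { it.read(modelBuffer) }
    modelBuffer.rewind()

    // Assumes one float32 input of shape [1, input.size] and one float32 output
    // of shape [1, outputSize]; adjust to match the actual model signature.
    val output = Array(1) { FloatArray(outputSize) }
    Interpreter(modelBuffer, options).use { interpreter ->
        interpreter.run(arrayOf(input), output)
    }
    nnApiDelegate.close()
    return output[0]
}
```

The catch, as with MNN, is coverage: a diffusion model like SDXL contains operators the vendor driver may not support, so profiling on the target phone is the only way to confirm the NPU is doing the work instead of the CPU fallback.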