[SAMPLE] I want to use the NPU with Phi4 mini for inference.
My PC has a Qualcomm processor, and I need to run Phi4 mini for inference. Using QNN (the NPU) is a mandatory requirement. I found an example that runs Phi4 mini on the CPU, but what I need is the NPU. Can you help provide an example? I did notice a parameter (provider) that can be passed "QNN". I forced provider to "qnn", but the model still seems to run on the CPU, with no effect. Do I need to convert the model again?
```csharp
cancellationToken.ThrowIfCancellationRequested();

var config = new Config(modelDir);
//if (!string.IsNullOrEmpty(provider))
//{
//    config.AppendProvider(provider);
//}
config.AppendProvider("qnn"); // force the QNN provider instead of using the passed-in value
chatClient = new OnnxRuntimeGenAIChatClient(config, true, options);
cancellationToken.ThrowIfCancellationRequested();
```
Hi. To run the model on the NPU, you will need a model that is built for the NPU (QNN in this case). There is an NPU model here you can try: https://huggingface.co/microsoft/Phi-4-mini-reasoning-onnx. However, I just tried it myself and ran into a bug similar to https://github.com/microsoft/Foundry-Local/issues/67, which I'm told is being worked on and should be fixed soon.
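Once that fix lands, loading the QNN build should look roughly like the sketch below. This is a minimal console sketch, not the gallery's actual code: it assumes the NPU (QNN) variant from the Hugging Face link above has been downloaded locally, that `modelDir` points at the folder containing its genai_config.json, and that the current Microsoft.ML.OnnxRuntimeGenAI C# API (Config/Model/Tokenizer/Generator) is available. The path, prompt format, and max_length are placeholders, and it uses the lower-level Generator API rather than OnnxRuntimeGenAIChatClient just to stay self-contained.

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI;

class QnnSmokeTest
{
    static void Main()
    {
        // Placeholder path: point this at the downloaded QNN/NPU model folder
        // (the one that contains genai_config.json for the NPU variant).
        var modelDir = @"C:\models\Phi-4-mini-reasoning-onnx\npu";

        using var config = new Config(modelDir);
        config.AppendProvider("qnn"); // same provider string as in the snippet above

        using var model = new Model(config);
        using var tokenizer = new Tokenizer(model);

        // Placeholder prompt; check the model card for the exact chat template.
        var prompt = "<|user|>\nWhat is 2 + 2?<|end|>\n<|assistant|>\n";
        using var sequences = tokenizer.Encode(prompt);

        using var generatorParams = new GeneratorParams(model);
        generatorParams.SetSearchOption("max_length", 512); // placeholder limit

        using var generator = new Generator(model, generatorParams);
        generator.AppendTokenSequences(sequences);

        // Generate token by token until the model finishes.
        while (!generator.IsDone())
        {
            generator.GenerateNextToken();
        }

        Console.WriteLine(tokenizer.Decode(generator.GetSequence(0)));
    }
}
```

The key point is the model folder, not the API call: AppendProvider can only select a provider the model was built for, which is why forcing "qnn" on the CPU build had no effect in your snippet.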
I'll keep this issue open, as I think we should include this model as a recommended model in the gallery, but I will wait to do that until this bug is resolved.