
Issues on Intel NPU and OpenVINO EP

Open 952313 opened this issue 3 weeks ago • 3 comments

Problems on Intel NPU and Foundry Local

## Background

I posted this in Discussions too. I also asked an AI about it; it considers this a bug rather than something on my end, so I'm filing it as an issue.


Hi there! I'm trying to use the Intel NPU with Foundry Local. The NPU driver is already installed successfully.


foundry model run DeepSeek-R1-Distill-Qwen-7B-openvino-npu:1


I tried to run it, but it failed:

C:\Users\(hidden)\.foundry>foundry model run DeepSeek-R1-Distill-Qwen-7B-openvino-npu:1
Exception: ❌ Model <DeepSeek-R1-Distill-Qwen-7B-openvino-npu:1> was not found in the catalog or local cache.
🔍 Did you mean: `foundry model run deepseek-r1-distill-qwen-7b-generic-gpu:3`?
👉 Try `foundry model list` to see available models.
👉 Or check for typos in the model name.
[12:37:58 ERR] LogException
Microsoft.AI.Foundry.Local.Common.FLException: ❌ Model <DeepSeek-R1-Distill-Qwen-7B-openvino-npu:1> was not found in the catalog or local cache.
🔍 Did you mean: `foundry model run deepseek-r1-distill-qwen-7b-generic-gpu:3`?
👉 Try `foundry model list` to see available models.
👉 Or check for typos in the model name.
   at Microsoft.AI.Foundry.Local.Catalog.AzureFoundryCatalog.<SuggestFuzzyMatch>d__11.MoveNext() + 0xd1c
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.Common.ModelManagement.<DownloadModelAsync>d__9.MoveNext() + 0x383
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.Commands.ModelRunCommand.<<Create>b__1_0>d.MoveNext() + 0x5d6
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.Common.CommandActionFactory.<>c__DisplayClass0_0`1.<<Create>b__0>d.MoveNext() + 0x1e7
--- End of stack trace from previous location ---
   at System.CommandLine.NamingConventionBinder.CommandHandler.<GetExitCodeAsync>d__66.MoveNext() + 0x50
--- End of stack trace from previous location ---
   at System.CommandLine.NamingConventionBinder.ModelBindingCommandHandler.<InvokeAsync>d__11.MoveNext() + 0x61
--- End of stack trace from previous location ---
   at System.CommandLine.Invocation.InvocationPipeline.<InvokeAsync>d__0.MoveNext() + 0x1cd
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.Program.<Main>d__1.MoveNext() + 0x4e4

C:\Users\(Hidden)\.foundry>

Too bad!


Another problem (maybe the same root cause?): I tried to list all the NPU-supported models, but:

C:\Users\(hidden)\.foundry>foundry model list --filter device=NPU
🕚 Downloading complete!...
Failed to download or register the following EPs: OpenVINOExecutionProvider. Will try installing again later.
Valid EPs: CPUExecutionProvider, WebGpuExecutionProvider
Alias                          Device     Task               File Size    License      Model ID

C:\Users\(Hidden)\.foundry>

The problem is the line: `Failed to download or register the following EPs: OpenVINOExecutionProvider`.
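Until `--filter device=NPU` works, the fixed-width table from `foundry model list` can be filtered by hand. A rough sketch, assuming the column layout shown in this thread (the sample rows here are illustrative; an NPU model would carry `NPU` in the Device column):

```python
# Keep rows of `foundry model list`-style output that match one device.
# Sample copied from the output in this thread; real output may differ.
sample = """\
phi-4                          GPU        chat-completion    8.37 GB      MIT          Phi-4-generic-gpu:1
                               CPU        chat-completion    10.16 GB     MIT          Phi-4-generic-cpu:1
"""

def rows_for_device(text: str, device: str) -> list[str]:
    out = []
    for line in text.splitlines():
        fields = line.split()
        # Device is the first field on continuation rows, second on alias rows.
        if device in fields[:2]:
            out.append(line.rstrip())
    return out

print(rows_for_device(sample, "GPU"))
```

On the output shown in this thread, filtering for `NPU` returns nothing, consistent with the empty table the CLI printed after the EP registration failure.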


I don't know whether this is a bug or something I did wrong, which is why I initially posted it in Discussions rather than as an issue. If you know what I can do, please comment. Thank you very much!

Environment

- CPU: Intel Core Ultra 7 155H
- Memory: 32 GB
- NPU: Intel AI Boost, driver 32.0.100.4404 (2025/10/26)
- GPU: Intel Arc Graphics
- OS: Windows 11 24H2
- Region: Hong Kong, China

I downloaded the OpenVINO toolkit from Intel and added its lib, release, and debug folders to Path.

Python is available on my machine. OpenVINO under Python 3.12.8 detects the NPU correctly.
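For reference, that kind of OpenVINO device check can be sketched as follows. `Core().available_devices` is the standard OpenVINO Runtime Python API; the import is guarded here so the sketch degrades gracefully when the third-party `openvino` package is not installed:

```python
# List the devices OpenVINO can see, e.g. ['CPU', 'GPU', 'NPU'].
# `openvino` is a third-party package; guard the import so this runs anywhere.
try:
    import openvino as ov
    devices = ov.Core().available_devices
except ImportError:
    devices = []  # OpenVINO not installed in this environment

print(devices)
```

If `NPU` appears in this list, OpenVINO itself can reach the NPU, which points the finger at the Foundry Local EP download/registration step rather than the driver.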

952313 avatar Nov 30 '25 06:11 952313

I downgraded to 0.7.120 and it works correctly: I can use the NPU.

952313 avatar Nov 30 '25 11:11 952313

Thank you for reporting this. Can you please share the output of `foundry model list` with the version of Foundry Local that is not working? Also, are you running the latest Windows release, or a Windows Insider build?

natke avatar Dec 01 '25 01:12 natke


About `foundry model list`:

C:\Users\(hidden)>foundry model list
Alias                          Device     Task               File Size    License      Model ID
-----------------------------------------------------------------------------------------------
phi-4                          GPU        chat-completion    8.37 GB      MIT          Phi-4-generic-gpu:1
                               CPU        chat-completion    10.16 GB     MIT          Phi-4-generic-cpu:1
----------------------------------------------------------------------------------------------------------
phi-3.5-mini                   GPU        chat-completion    2.16 GB      MIT          Phi-3.5-mini-instruct-generic-gpu:1
                               CPU        chat-completion    2.53 GB      MIT          Phi-3.5-mini-instruct-generic-cpu:1
--------------------------------------------------------------------------------------------------------------------------
phi-3-mini-128k                GPU        chat-completion    2.13 GB      MIT          Phi-3-mini-128k-instruct-generic-gpu:1
                               CPU        chat-completion    2.54 GB      MIT          Phi-3-mini-128k-instruct-generic-cpu:2
-----------------------------------------------------------------------------------------------------------------------------
phi-3-mini-4k                  GPU        chat-completion    2.13 GB      MIT          Phi-3-mini-4k-instruct-generic-gpu:1
                               CPU        chat-completion    2.53 GB      MIT          Phi-3-mini-4k-instruct-generic-cpu:2
---------------------------------------------------------------------------------------------------------------------------
mistral-7b-v0.2                GPU        chat-completion    4.07 GB      apache-2.0   mistralai-Mistral-7B-Instruct-v0-2-generic-gpu:1
                               CPU        chat-completion    4.07 GB      apache-2.0   mistralai-Mistral-7B-Instruct-v0-2-generic-cpu:2
---------------------------------------------------------------------------------------------------------------------------------------
deepseek-r1-14b                GPU        chat-completion    10.27 GB     MIT          deepseek-r1-distill-qwen-14b-generic-gpu:3
                               CPU        chat-completion    11.51 GB     MIT          deepseek-r1-distill-qwen-14b-generic-cpu:3
---------------------------------------------------------------------------------------------------------------------------------
deepseek-r1-7b                 GPU        chat-completion    5.58 GB      MIT          deepseek-r1-distill-qwen-7b-generic-gpu:3
                               CPU        chat-completion    6.43 GB      MIT          deepseek-r1-distill-qwen-7b-generic-cpu:3
--------------------------------------------------------------------------------------------------------------------------------
qwen2.5-coder-0.5b             GPU        chat-completion    0.52 GB      apache-2.0   qwen2.5-coder-0.5b-instruct-generic-gpu:4
                               CPU        chat-completion    0.80 GB      apache-2.0   qwen2.5-coder-0.5b-instruct-generic-cpu:4
--------------------------------------------------------------------------------------------------------------------------------
phi-4-mini-reasoning           GPU        chat-completion    3.15 GB      MIT          Phi-4-mini-reasoning-generic-gpu:3
                               CPU        chat-completion    4.52 GB      MIT          Phi-4-mini-reasoning-generic-cpu:3
-------------------------------------------------------------------------------------------------------------------------
qwen2.5-0.5b                   GPU        chat-completion    0.68 GB      apache-2.0   qwen2.5-0.5b-instruct-generic-gpu:4
                               CPU        chat-completion    0.80 GB      apache-2.0   qwen2.5-0.5b-instruct-generic-cpu:4
--------------------------------------------------------------------------------------------------------------------------
qwen2.5-1.5b                   GPU        chat-completion    1.51 GB      apache-2.0   qwen2.5-1.5b-instruct-generic-gpu:4
                               CPU        chat-completion    1.78 GB      apache-2.0   qwen2.5-1.5b-instruct-generic-cpu:4
--------------------------------------------------------------------------------------------------------------------------
qwen2.5-coder-1.5b             GPU        chat-completion    1.25 GB      apache-2.0   qwen2.5-coder-1.5b-instruct-generic-gpu:4
                               CPU        chat-completion    1.78 GB      apache-2.0   qwen2.5-coder-1.5b-instruct-generic-cpu:4
--------------------------------------------------------------------------------------------------------------------------------
phi-4-mini                     GPU        chat-completion    3.72 GB      MIT          Phi-4-mini-instruct-generic-gpu:5
                               CPU        chat-completion    4.80 GB      MIT          Phi-4-mini-instruct-generic-cpu:5
------------------------------------------------------------------------------------------------------------------------
qwen2.5-14b                    GPU        chat-completion    9.30 GB      apache-2.0   qwen2.5-14b-instruct-generic-gpu:4
                               CPU        chat-completion    11.06 GB     apache-2.0   qwen2.5-14b-instruct-generic-cpu:4
-------------------------------------------------------------------------------------------------------------------------
qwen2.5-coder-14b              GPU        chat-completion    8.79 GB      apache-2.0   qwen2.5-coder-14b-instruct-generic-gpu:4
                               CPU        chat-completion    11.06 GB     apache-2.0   qwen2.5-coder-14b-instruct-generic-cpu:4
-------------------------------------------------------------------------------------------------------------------------------
qwen2.5-coder-7b               GPU        chat-completion    4.73 GB      apache-2.0   qwen2.5-coder-7b-instruct-generic-gpu:4
                               CPU        chat-completion    6.16 GB      apache-2.0   qwen2.5-coder-7b-instruct-generic-cpu:4
------------------------------------------------------------------------------------------------------------------------------
qwen2.5-7b                     GPU        chat-completion    5.20 GB      apache-2.0   qwen2.5-7b-instruct-generic-gpu:4
                               CPU        chat-completion    6.16 GB      apache-2.0   qwen2.5-7b-instruct-generic-cpu:4

======

My system

I'm not on the Windows Insider Program.

- CPU: Intel Core Ultra 7 155H
- Memory: 32 GB
- NPU: Intel AI Boost, driver 32.0.100.4404 (2025/10/26)
- GPU: Intel Arc Graphics
- OS: **Windows 11 24H2**

952313 avatar Dec 01 '25 11:12 952313