# Issues with Intel NPU and the OpenVINO EP in Foundry Local
## Background

I posted this in a discussion as well. I also asked an AI about it; it suggested this is a bug rather than something I did wrong, so I am filing it as an issue.

Hi there! I'm trying to use an Intel NPU with Foundry Local. The NPU driver is already installed successfully.
I tried to run the model, but it failed:
```
C:\Users\(hidden)\.foundry>foundry model run DeepSeek-R1-Distill-Qwen-7B-openvino-npu:1
Exception: ❌ Model <DeepSeek-R1-Distill-Qwen-7B-openvino-npu:1> was not found in the catalog or local cache.
🔍 Did you mean: `foundry model run deepseek-r1-distill-qwen-7b-generic-gpu:3`?
👉 Try `foundry model list` to see available models.
👉 Or check for typos in the model name.
[12:37:58 ERR] LogException
Microsoft.AI.Foundry.Local.Common.FLException: ❌ Model <DeepSeek-R1-Distill-Qwen-7B-openvino-npu:1> was not found in the catalog or local cache.
🔍 Did you mean: `foundry model run deepseek-r1-distill-qwen-7b-generic-gpu:3`?
👉 Try `foundry model list` to see available models.
👉 Or check for typos in the model name.
   at Microsoft.AI.Foundry.Local.Catalog.AzureFoundryCatalog.<SuggestFuzzyMatch>d__11.MoveNext() + 0xd1c
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.Common.ModelManagement.<DownloadModelAsync>d__9.MoveNext() + 0x383
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.Commands.ModelRunCommand.<<Create>b__1_0>d.MoveNext() + 0x5d6
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.Common.CommandActionFactory.<>c__DisplayClass0_0`1.<<Create>b__0>d.MoveNext() + 0x1e7
--- End of stack trace from previous location ---
   at System.CommandLine.NamingConventionBinder.CommandHandler.<GetExitCodeAsync>d__66.MoveNext() + 0x50
--- End of stack trace from previous location ---
   at System.CommandLine.NamingConventionBinder.ModelBindingCommandHandler.<InvokeAsync>d__11.MoveNext() + 0x61
--- End of stack trace from previous location ---
   at System.CommandLine.Invocation.InvocationPipeline.<InvokeAsync>d__0.MoveNext() + 0x1cd
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.Program.<Main>d__1.MoveNext() + 0x4e4

C:\Users\(Hidden)\.foundry>
```
Too bad!
Another problem (maybe the same root cause?): I tried to list all the NPU-supported models, but:
```
C:\Users\(hidden)\.foundry>foundry model list --filter device=NPU
🕚 Downloading complete!...
Failed to download or register the following EPs: OpenVINOExecutionProvider. Will try installing again later.
Valid EPs: CPUExecutionProvider, WebGpuExecutionProvider
Alias  Device  Task  File Size  License  Model ID

C:\Users\(Hidden)\.foundry>
```
The problem is the line `Failed to download or register the following EPs`. I don't know whether this is a bug or something I did wrong, which is why I originally posted it as a discussion instead of an issue. If you know what I can do, please comment. Thank you very much!
## Environment

- CPU: Intel Core Ultra 7 155H
- Memory: 32 GB
- NPU: Intel AI Boost, driver 32.0.100.4404 (dated 2025/10/26)
- GPU: Intel Arc Graphics
- OS: Windows 11 24H2
- Region: Hong Kong, China
I downloaded the OpenVINO toolkit from Intel and added its lib, release, and debug folders to PATH.

Python is available on this machine, and OpenVINO under Python 3.12.8 detects the NPU correctly.
After downgrading Foundry Local to 0.7.120, everything works correctly and I can use the NPU.
Thank you for reporting this. Could you please share the output of `foundry model list` from the version of Foundry Local that is not working? Also, are you on the latest public Windows release, or a Windows Insider build?
Reply
About `foundry model list`:

```
C:\Users\(hidden)>foundry model list
Alias                 Device  Task             File Size  License     Model ID
------------------------------------------------------------------------------------------------------
phi-4                 GPU     chat-completion  8.37 GB    MIT         Phi-4-generic-gpu:1
                      CPU     chat-completion  10.16 GB   MIT         Phi-4-generic-cpu:1
phi-3.5-mini          GPU     chat-completion  2.16 GB    MIT         Phi-3.5-mini-instruct-generic-gpu:1
                      CPU     chat-completion  2.53 GB    MIT         Phi-3.5-mini-instruct-generic-cpu:1
phi-3-mini-128k       GPU     chat-completion  2.13 GB    MIT         Phi-3-mini-128k-instruct-generic-gpu:1
                      CPU     chat-completion  2.54 GB    MIT         Phi-3-mini-128k-instruct-generic-cpu:2
phi-3-mini-4k         GPU     chat-completion  2.13 GB    MIT         Phi-3-mini-4k-instruct-generic-gpu:1
                      CPU     chat-completion  2.53 GB    MIT         Phi-3-mini-4k-instruct-generic-cpu:2
mistral-7b-v0.2       GPU     chat-completion  4.07 GB    apache-2.0  mistralai-Mistral-7B-Instruct-v0-2-generic-gpu:1
                      CPU     chat-completion  4.07 GB    apache-2.0  mistralai-Mistral-7B-Instruct-v0-2-generic-cpu:2
deepseek-r1-14b       GPU     chat-completion  10.27 GB   MIT         deepseek-r1-distill-qwen-14b-generic-gpu:3
                      CPU     chat-completion  11.51 GB   MIT         deepseek-r1-distill-qwen-14b-generic-cpu:3
deepseek-r1-7b        GPU     chat-completion  5.58 GB    MIT         deepseek-r1-distill-qwen-7b-generic-gpu:3
                      CPU     chat-completion  6.43 GB    MIT         deepseek-r1-distill-qwen-7b-generic-cpu:3
qwen2.5-coder-0.5b    GPU     chat-completion  0.52 GB    apache-2.0  qwen2.5-coder-0.5b-instruct-generic-gpu:4
                      CPU     chat-completion  0.80 GB    apache-2.0  qwen2.5-coder-0.5b-instruct-generic-cpu:4
phi-4-mini-reasoning  GPU     chat-completion  3.15 GB    MIT         Phi-4-mini-reasoning-generic-gpu:3
                      CPU     chat-completion  4.52 GB    MIT         Phi-4-mini-reasoning-generic-cpu:3
qwen2.5-0.5b          GPU     chat-completion  0.68 GB    apache-2.0  qwen2.5-0.5b-instruct-generic-gpu:4
                      CPU     chat-completion  0.80 GB    apache-2.0  qwen2.5-0.5b-instruct-generic-cpu:4
qwen2.5-1.5b          GPU     chat-completion  1.51 GB    apache-2.0  qwen2.5-1.5b-instruct-generic-gpu:4
                      CPU     chat-completion  1.78 GB    apache-2.0  qwen2.5-1.5b-instruct-generic-cpu:4
qwen2.5-coder-1.5b    GPU     chat-completion  1.25 GB    apache-2.0  qwen2.5-coder-1.5b-instruct-generic-gpu:4
                      CPU     chat-completion  1.78 GB    apache-2.0  qwen2.5-coder-1.5b-instruct-generic-cpu:4
phi-4-mini            GPU     chat-completion  3.72 GB    MIT         Phi-4-mini-instruct-generic-gpu:5
                      CPU     chat-completion  4.80 GB    MIT         Phi-4-mini-instruct-generic-cpu:5
qwen2.5-14b           GPU     chat-completion  9.30 GB    apache-2.0  qwen2.5-14b-instruct-generic-gpu:4
                      CPU     chat-completion  11.06 GB   apache-2.0  qwen2.5-14b-instruct-generic-cpu:4
qwen2.5-coder-14b     GPU     chat-completion  8.79 GB    apache-2.0  qwen2.5-coder-14b-instruct-generic-gpu:4
                      CPU     chat-completion  11.06 GB   apache-2.0  qwen2.5-coder-14b-instruct-generic-cpu:4
qwen2.5-coder-7b      GPU     chat-completion  4.73 GB    apache-2.0  qwen2.5-coder-7b-instruct-generic-gpu:4
                      CPU     chat-completion  6.16 GB    apache-2.0  qwen2.5-coder-7b-instruct-generic-cpu:4
qwen2.5-7b            GPU     chat-completion  5.20 GB    apache-2.0  qwen2.5-7b-instruct-generic-gpu:4
                      CPU     chat-completion  6.16 GB    apache-2.0  qwen2.5-7b-instruct-generic-cpu:4
```

Note that no NPU/OpenVINO models appear in the list at all.
## My system

I'm not using the Windows Insider Program.

- CPU: Intel Core Ultra 7 155H
- Memory: 32 GB
- NPU: Intel AI Boost, driver 32.0.100.4404 (dated 2025/10/26)
- GPU: Intel Arc Graphics
- OS: **Windows 11 24H2**