
Support for Copilot+ PCs

Open pkbullock opened this issue 1 year ago • 15 comments

It would be great if AI Toolkit could leverage the NPU in Copilot+ PCs. Currently it uses the CPU; it's nice and quick on the Snapdragon processors, but it isn't using the AI processor when running models.

pkbullock avatar Sep 02 '24 21:09 pkbullock

I wonder if this is related to onnxruntime-genai still awaiting QNN support.
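A quick way to check this on your own machine (a hedged sketch, not something from the thread): ask onnxruntime which execution providers the installed build actually exposes. If `QNNExecutionProvider` is not in the list, nothing built on top of it can reach the NPU through QNN.

```python
# Sketch: check whether the installed onnxruntime build exposes the
# QNN execution provider (the EP that targets the Qualcomm NPU).
def has_qnn(providers):
    """Return True if the QNN execution provider is in the list."""
    return "QNNExecutionProvider" in providers

try:
    import onnxruntime as ort
    print("QNN available:", has_qnn(ort.get_available_providers()))
except ImportError:
    print("onnxruntime is not installed in this environment")
```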

sirredbeard avatar Sep 05 '24 15:09 sirredbeard

The docs list this as supporting Copilot+ PCs, but it doesn't: my NPU activity is 0%. So how do I use this?

pkbullock avatar Sep 10 '24 20:09 pkbullock

I don't see any reference to Copilot+ PCs in the AI Toolkit docs yet, at least not here. Because it relies on onnxruntime-genai, I believe QNN support must land there first before AI Toolkit can take full advantage of it. You might be able to take some advantage of the NPU now, indirectly, by using DirectML with a model like Phi-3-mini-4k-directml-int4-awq-block-128-onnx, which is optimized for it. I have been using DirectML on my non-Copilot Qualcomm-based WDK23 to speed up training.
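The fallback idea above can be sketched as a provider-preference helper (the EP names are the real ONNX Runtime identifiers; the helper itself is hypothetical): prefer QNN if it is ever exposed, then DirectML, then CPU.

```python
# Sketch of an execution-provider fallback chain: NPU via QNN if the
# build exposes it, otherwise DirectML (D3D12 GPU/NPU), otherwise CPU.
PREFERENCE = [
    "QNNExecutionProvider",  # Qualcomm NPU (not yet wired up in onnxruntime-genai)
    "DmlExecutionProvider",  # DirectML, the indirect route described above
    "CPUExecutionProvider",  # always available
]

def pick_provider(available):
    """Return the most preferred execution provider present in `available`."""
    for ep in PREFERENCE:
        if ep in available:
            return ep
    return "CPUExecutionProvider"
```

With a real session you would then do something like `ort.InferenceSession("model.onnx", providers=[pick_provider(ort.get_available_providers())])`.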

sirredbeard avatar Sep 11 '24 16:09 sirredbeard

Hi @sirredbeard - I saw it in the release notes shown on installation of the VS Code extension, which mention support. But I agree, many frameworks seem to depend on the QNN runtimes/SDKs being released. Image

pkbullock avatar Sep 12 '24 08:09 pkbullock

It seems like DirectML models don't show up in the model catalog on my PC that has a Qualcomm NPU.

wmmc88 avatar Sep 16 '24 23:09 wmmc88

Me neither - what is the course of action to enable models to show up on Snapdragon machines?

rockcat avatar Oct 24 '24 10:10 rockcat

https://blogs.windows.com/windowsdeveloper/2025/01/29/running-distilled-deepseek-r1-models-locally-on-copilot-pcs-powered-by-windows-copilot-runtime/

Apparently, this feature is coming soon.

HlexNC avatar Jan 31 '25 22:01 HlexNC

https://blogs.windows.com/windowsdeveloper/2025/01/29/running-distilled-deepseek-r1-models-locally-on-copilot-pcs-powered-by-windows-copilot-runtime/

Apparently, this feature is coming soon.

The latest update states that the feature has already been released, but I didn't find the model in the catalog.

GuyZhangZhang avatar Feb 07 '25 02:02 GuyZhangZhang

Which device? I can see and use that NPU model with X Elite.

xgdgsc avatar Feb 07 '25 05:02 xgdgsc

Which device? I can see and use that NPU model with X Elite.

Surface Pro with a Snapdragon CPU. Image

Not sure why I don't have the local NPU option. Image

GuyZhangZhang avatar Feb 07 '25 07:02 GuyZhangZhang

At first I had installed the extension in the remote SSH session. After installing it in the local window, it works.

xgdgsc avatar Feb 07 '25 10:02 xgdgsc

I don't think the update they mentioned is out yet, as it is still impossible to download many models.

HlexNC avatar Feb 08 '25 16:02 HlexNC

Snapdragon X Elite PC with AI Toolkit v0.8.6

Image

Works lovely using the NPU:

Image

rockcat avatar Feb 09 '25 08:02 rockcat

@rockcat - what device are you using? I have seen the cool updates and can download the model, but I cannot run it:

I am running the Windows beta, and have tried both release and pre-release versions. Deleted and redownloaded the models. Lenovo Yoga Slim 7x.

Image

Debug: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [0] 2025-02-09T09:18:47.8111417+00:00 LoadModel model:DeepSeek-R1-Distilled-NPU-Optimized
Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [1400] 2025-02-09T09:18:47.811353+00:00 Loading model:DeepSeek-R1-Distilled-NPU-Optimized
Error: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [1402] 2025-02-09T09:18:51.4987663+00:00 Failed loading model:DeepSeek-R1-Distilled-NPU-Optimized error: [Failed to load from EpContext model. qnn_backend_manager.cc:676 onnxruntime::qnn::QnnBackendManager::LoadCachedQnnContextFromBuffer Failed to create context from binary.,
  at Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess(IntPtr) + 0x58
  at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx.LoadModelAsync(String, CancellationToken) + 0x110
  at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderBase1.<EnsureModelLoadedAsync>d__41.MoveNext() + 0x3bc
  --- End of stack trace from previous location ---
  at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x24
  at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0x100
  at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x68
  at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderBase1.<LoadModelAsync>d__27.MoveNext() + 0x130
  --- End of stack trace from previous location ---
  at Microsoft.Neutron.OpenAI.WebApplicationFactory.<>c.<<Create>b__0_6>d.MoveNext() + 0x114]
Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [1401] 2025-02-09T09:18:51.4993977+00:00 Finish loading model:DeepSeek-R1-Distilled-NPU-Optimized elapsed time:00:00:03.6880328
[2025-02-09T09:18:51.506Z] [ERROR] Failed loading model DeepSeek-R1-Distilled-NPU-Optimized. Failed to load from EpContext model. qnn_backend_manager.cc:676 onnxruntime::qnn::QnnBackendManager::LoadCachedQnnContextFromBuffer Failed to create context from binary.

A second load attempt fails the same way:

Error: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [1402] 2025-02-09T09:18:58.8581716+00:00 Failed loading model:DeepSeek-R1-Distilled-NPU-Optimized error: [Failed to load from EpContext model. qnn_backend_manager.cc:676 onnxruntime::qnn::QnnBackendManager::LoadCachedQnnContextFromBuffer Failed to create context from binary.]
[2025-02-09T09:18:58.866Z] [ERROR] Failed loading model DeepSeek-R1-Distilled-NPU-Optimized. Failed to load from EpContext model. qnn_backend_manager.cc:676 onnxruntime::qnn::QnnBackendManager::LoadCachedQnnContextFromBuffer Failed to create context from binary.

pkbullock avatar Feb 09 '25 09:02 pkbullock

Lenovo Yoga with Snapdragon X elite


rockcat avatar Feb 09 '25 15:02 rockcat