Integrate the Windows Machine Learning framework to run ONNX models.

Open hwf1324 opened this issue 5 months ago • 7 comments

Is your feature request related to a problem? Please describe.

Microsoft added the WinML API in Windows 10 1809 and later versions. We can use this API to run ONNX models for tasks such as OCR.

Possible advantages and disadvantages:

Advantages:

  • No need to bundle ONNXRuntime; instead, we use the ONNXRuntime that ships with the system's WinML, which reduces the size of the installation package.
  • WinML handles various hardware calls, enabling the use of high-performance GPUs.

Disadvantages:

  • Requires writing C++ code, which may not be as flexible as using ONNXRuntime directly.
  • Relying on the system-bundled WinML is less reliable than shipping our own bundled copy of WinML.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

hwf1324 avatar Aug 10 '25 22:08 hwf1324

Since this API is already legacy should we be using it? It seems that we're stuck between a legacy API and an experimental one. https://learn.microsoft.com/en-us/windows/ai/new-windows-ml/overview

seanbudd avatar Aug 11 '25 01:08 seanbudd

Since this API is already legacy should we be using it?

Yes, but I believe this is the most appropriate option at present. There are currently three WinML options: the new WinML, the old WinML integrated into the system, and the old WinML bundled via the NuGet package. The new WinML is still in the experimental stage, and a large number of our users are still on Windows 10.
The NuGet package is only needed if we have to support Windows versions from 8.1 up to Windows 10 1809, or if the model we run requires specific ONNXRuntime operators.
For now, the system-bundled WinML strikes the best balance between system requirements and package size.
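
The trade-off above can be sketched as a small decision helper. This is only an illustration: the function name `pick_winml_option` and the option labels `"system"` and `"nuget"` are hypothetical, and the only hard fact it relies on is that Windows 10 version 1809 corresponds to build 17763.

```python
# Windows 10 version 1809, the first release that ships WinML in the OS.
WIN10_1809_BUILD = 17763


def pick_winml_option(os_build: int, needs_custom_operators: bool = False) -> str:
    """Return a (hypothetical) label for the WinML packaging option that fits.

    "system" means the WinML bundled with Windows, "nuget" means shipping
    the redistributable package with the application.
    """
    # A model that needs specific ONNXRuntime operators cannot rely on
    # whatever version the operating system happens to ship.
    if needs_custom_operators:
        return "nuget"
    # Systems older than Windows 10 1809 have no built-in WinML at all.
    if os_build < WIN10_1809_BUILD:
        return "nuget"
    return "system"
```

Under these assumptions, a Windows 10 or 11 machine with a plain model ends up on the system copy, and the larger package is only paid for when one of the two exceptions applies.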

hwf1324 avatar Aug 11 '25 02:08 hwf1324

We will no longer support Win 8.1 at all in 2026.1. NVDA on Win 10 will likely no longer be recommended or tested in the near future, as Win 10 reaches EOL in October. I think generally, we should be okay to build for Windows 11 only, particularly for cutting edge Windows features like this. Building against an API that's already legacy seems like a maintenance nightmare, particularly with how rapidly AI related tech is developing. However, at the same time, the legacy API is a stable and frozen API to use, and we can't really ship features using the experimental API. If the pull request were created today, perhaps we would accept the legacy WinML being used, but I think we need to discuss it. In any case, any development needs to be easily refactored/updated to the new API as it develops.

seanbudd avatar Aug 11 '25 02:08 seanbudd

Since this API is already legacy should we be using it? It seems that we're stuck between a legacy API and an experimental one. https://learn.microsoft.com/en-us/windows/ai/new-windows-ml/overview

Update: the new WinML has been officially released.

hwf1324 avatar Sep 26 '25 03:09 hwf1324

@tianzeshi-study - is this something you could prioritise? I think it would make #19337 #19338 much easier

seanbudd avatar Dec 05 '25 04:12 seanbudd

It seems that WinML is still a very early-stage project, and using it may involve running into many unexpected issues. Integrating this framework into the project could introduce risks and uncertainties.

Regarding model inference, the current model runs in under one second on the CPU, which is far from reaching its performance limit. What we may need at this stage is a more powerful and more suitable model rather than a new runtime.

WinML’s potential advantage lies in model downloading and dynamic dependency resolution. However, using an additional abstraction layer such as WinML also brings challenges: its model download process hides many details, making it difficult to customize—for example, using a mirror or specifying a custom installation path under the user's configuration directory.
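
To make the customization point concrete, here is a minimal sketch of the kind of control an application-managed download keeps: choosing mirrors and choosing where the model lands. The helper names `candidate_urls` and `model_install_path`, and the `models` subfolder, are assumptions made for illustration; they are not part of WinML or NVDA.

```python
from pathlib import Path
from urllib.parse import urljoin


def candidate_urls(mirrors: list[str], relative_path: str) -> list[str]:
    """Build one download URL per mirror, in priority order."""
    urls = []
    for base in mirrors:
        # urljoin replaces the last path segment unless the base ends with "/".
        if not base.endswith("/"):
            base += "/"
        urls.append(urljoin(base, relative_path.lstrip("/")))
    return urls


def model_install_path(config_dir: Path, model_name: str) -> Path:
    """Place the model in a "models" folder under the user's configuration directory."""
    return config_dir / "models" / model_name
```

A caller would try each URL from `candidate_urls` in turn and write the result to `model_install_path(...)`; the actual download step is left out here on purpose.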

Moreover, the download endpoint for its models is confusing; it does not appear to fetch them from public sources like Hugging Face or GitHub.

Additionally, the Python documentation for the new framework is not as complete as one might expect. For example, in the model catalog configuration, I could only find C# code samples: https://learn.microsoft.com/en-us/windows/ai/new-windows-ml/model-catalog/get-started

Switching to the new framework would require more testing and careful evaluation, rather than adopting it immediately just because it is the latest technology while overlooking stability. Although the current architecture already makes it very easy to swap in other inference frameworks—such as PyTorch or TensorFlow—this transition still should not be taken lightly.
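
As a rough illustration of what a swappable backend layer can look like, here is a sketch with hypothetical names (`InferenceBackend`, `EchoBackend`, `register_backend`, `create_backend`). It is not the actual architecture referred to above, only one common way to keep the choice of runtime behind a small interface.

```python
from typing import Callable, Protocol


class InferenceBackend(Protocol):
    """Anything that can turn raw input bytes into a text result."""

    def run(self, data: bytes) -> str: ...


class EchoBackend:
    """Stand-in backend used only to demonstrate the wiring."""

    def run(self, data: bytes) -> str:
        return f"{len(data)} bytes received"


# Registry mapping a backend name to a factory that creates it.
_BACKENDS: dict[str, Callable[[], InferenceBackend]] = {"echo": EchoBackend}


def register_backend(name: str, factory: Callable[[], InferenceBackend]) -> None:
    """Make a new backend selectable by name."""
    _BACKENDS[name] = factory


def create_backend(name: str) -> InferenceBackend:
    """Instantiate the backend registered under the given name."""
    try:
        return _BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown backend: {name!r}") from None
```

With this shape, moving to a different runtime means registering one more factory and changing the name passed to `create_backend`; the calling code stays the same, which is why the risk sits in the runtime itself rather than in the wiring.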

tianzeshi-study avatar Dec 07 '25 12:12 tianzeshi-study

@tianzeshi-study the fact that WinML is still in its early stages is fine. It's going to be the future of Windows AI, and we are going to move to it at some point anyway. The feature we are providing is also in its early stages.

The current model might run fine under Python ONNX, but we want to integrate more resource-intensive models, and we don't want numpy as a dependency in NVDA.

seanbudd avatar Dec 08 '25 00:12 seanbudd