OpenVINO seems to ignore the execution provider session options
Describe the issue
Hi,
I'm using WindowsML + the OpenVINO provider on a Lunar Lake laptop.
It seems that the OpenVINO provider ignores the session options.
Here is pseudo code for the workflow I'm implementing:
```cpp
auto env = std::make_shared<Ort::Env>(ORT_LOGGING_LEVEL_VERBOSE, "OnnxInterface");

winrt::Microsoft::Windows::AI::MachineLearning::ExecutionProviderCatalog catalog =
    winrt::Microsoft::Windows::AI::MachineLearning::ExecutionProviderCatalog::GetDefault();
catalog.EnsureAndRegisterCertifiedAsync().get();

// Populate execution provider devices
std::unordered_map<std::string, std::vector<Ort::ConstEpDevice>> epDevices;
for (const auto& ep : env->GetEpDevices())
{
    DxO::FmtLogInfo("Available EP: {} - Device Name: {}", ep.EpName(), ep.Device().Vendor());
    epDevices[ep.EpName()].push_back(ep);
}

Ort::SessionOptions sessionOptions;
sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_DISABLE_ALL);

std::unordered_map<std::string, std::string> ep_options;
ep_options["precision"] = "FP32";
ep_options["device_type"] = "GPU";
sessionOptions.AppendExecutionProvider_V2(*env, {epDevices["OpenVINOExecutionProvider"]}, ep_options);

// Create session
auto session = std::make_unique<Ort::Session>(*env, modelInMemory.data(), modelInMemory.size(), sessionOptions);
```
- Even if I set "GPU" as device_type, the NPU gets selected. If I want to force the GPU, I have to filter the epDevices["OpenVINOExecutionProvider"] list.
- On a network with mixed FP16/FP32 precision, if I filter the list to get only the GPU and use precision "ACCURACY" or "FP32", the inference result is wrong. If I filter the list to get only the CPU, I get a proper result. I expect GPU FP32 and CPU to be close in terms of results. Note that this network works properly with DirectML on the same hardware, and it also works with GPUs from other vendors, both with DirectML and their respective execution providers. I suspect the provider is not honoring the "precision" option and just uses FP16 (which is known, for this network, to produce NaN values).
To reproduce
With an FP16 network:

```cpp
auto env = std::make_shared<Ort::Env>(ORT_LOGGING_LEVEL_VERBOSE, "OnnxInterface");

winrt::Microsoft::Windows::AI::MachineLearning::ExecutionProviderCatalog catalog =
    winrt::Microsoft::Windows::AI::MachineLearning::ExecutionProviderCatalog::GetDefault();
catalog.EnsureAndRegisterCertifiedAsync().get();

// Populate execution provider devices
std::unordered_map<std::string, std::vector<Ort::ConstEpDevice>> epDevices;
for (const auto& ep : env->GetEpDevices())
{
    DxO::FmtLogInfo("Available EP: {} - Device Name: {}", ep.EpName(), ep.Device().Vendor());
    epDevices[ep.EpName()].push_back(ep);
}

Ort::SessionOptions sessionOptions;
sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_DISABLE_ALL);

std::unordered_map<std::string, std::string> ep_options;
ep_options["precision"] = "FP32";
ep_options["device_type"] = "GPU";
sessionOptions.AppendExecutionProvider_V2(*env, {epDevices["OpenVINOExecutionProvider"]}, ep_options);

// Create session
auto session = std::make_unique<Ort::Session>(*env, modelInMemory.data(), modelInMemory.size(), sessionOptions);
```
then bind + run inference, and check whether the result is close to the expected FP32 result, and whether the GPU or the NPU was used.
Urgency
We manually select the CPU for this network, but the performance is obviously poor; we consider this an urgent issue.
Platform
Windows
OS Version
26100.6584
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.23
ONNX Runtime API
WinML
Architecture
X64
Execution Provider
OpenVINO
Execution Provider Library Version
1.8.15.0 (KB5067990)
The doc was updated, and the load_config option works
The appropriate way to set the device type with WinML+OV is outlined in this function:
```cpp
static std::pair<std::vector<Ort::ConstEpDevice>, std::vector<std::string>> ConfigureOVEPDevices(
    Ort::Env& env, Ort::SessionOptions& session_options, const Arguments& args,
    std::unordered_map<std::string, std::string>& ov_options) {
  std::cout << "Configuring OpenVINO Plugin EP with device type: " << args.device_type << std::endl;

  std::string meta_prefix;
  std::vector<std::string> ov_device_types;

  auto device_type_it = ov_options.find("device_type");
  auto device_type = device_type_it != ov_options.end() ? device_type_it->second : "CPU";
  ov_options.erase("device_type");

  // Split an optional meta prefix (e.g. "HETERO:") from the device list.
  std::string remainder;
  size_t colon_pos = device_type.find(':');
  if (colon_pos != std::string::npos) {
    meta_prefix = device_type.substr(0, colon_pos);
    remainder = device_type.substr(colon_pos + 1);
  } else {
    remainder = device_type;
  }

  // Split the comma-separated device list.
  size_t start = 0;
  while (start < remainder.size()) {
    size_t end = remainder.find(',', start);
    if (end == std::string::npos) end = remainder.size();
    if (end > start) {
      ov_device_types.emplace_back(remainder.substr(start, end - start));
    }
    start = end + 1;
  }

  auto ep_devices = env.GetEpDevices();

  // Find the EP device whose metadata "ov_device" entry matches the requested device.
  const auto get_ep_device = [&ep_devices](const std::string& ep_name,
                                           const std::string& ov_device) -> Ort::ConstEpDevice {
    for (Ort::ConstEpDevice& device : ep_devices) {
      if (std::string_view(device.EpName()).find(ep_name) != std::string::npos) {
        const auto& meta_kv = device.EpMetadata().GetKeyValuePairs();
        auto device_type_it = meta_kv.find("ov_device");
        if (device_type_it != meta_kv.end()) {
          if (device_type_it->second == ov_device) {
            return device;
          }
        }
      }
    }
    return Ort::ConstEpDevice{};
  };

  std::string ep_name = "OpenVINOExecutionProvider";
  if (!meta_prefix.empty()) {
    ep_name += "." + meta_prefix;
  }

  std::vector<Ort::ConstEpDevice> session_ep_devices;
  for (auto& ov_device : ov_device_types) {
    Ort::ConstEpDevice plugin_ep_device = get_ep_device(ep_name, ov_device);
    if (!plugin_ep_device) {
      // Retry without a device index suffix (e.g. "GPU.0" -> "GPU").
      size_t dot_pos = ov_device.find('.');
      if (dot_pos != std::string::npos) {
        ov_device.erase(dot_pos);
      }
      plugin_ep_device = get_ep_device(ep_name, ov_device);
      if (!plugin_ep_device) {
        throw std::runtime_error("Did not find an EP device with ep_name = " + ep_name +
                                 " & ov_device = " + ov_device);
      }
    }
    session_ep_devices.push_back(plugin_ep_device);
  }

  return {session_ep_devices, ov_device_types};
}
```
Get the EP devices from the env, get device information from the metadata, and add the available devices to a vector.