OpenVINO seems to ignore the execution provider session options
Describe the issue
Hi,
I'm using WindowsML + the OpenVINO provider on a Lunar Lake laptop.
It seems that the OpenVINO provider ignores the session options.
Here is pseudo code for the workflow I'm implementing:
```cpp
auto env = std::make_shared<Ort::Env>(ORT_LOGGING_LEVEL_VERBOSE, "OnnxInterface");

winrt::Microsoft::Windows::AI::MachineLearning::ExecutionProviderCatalog catalog =
    winrt::Microsoft::Windows::AI::MachineLearning::ExecutionProviderCatalog::GetDefault();
catalog.EnsureAndRegisterCertifiedAsync().get();

// Populate execution provider devices
std::unordered_map<std::string, std::vector<Ort::ConstEpDevice>> epDevices;
for (const auto& ep : env->GetEpDevices())
{
    DxO::FmtLogInfo("Available EP: {} - Device Name: {}", ep.EpName(), ep.Device().Vendor());
    epDevices[ep.EpName()].push_back(ep);
}

Ort::SessionOptions sessionOptions;
sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_DISABLE_ALL);

std::unordered_map<std::string, std::string> ep_options;
ep_options["precision"] = "FP32";
ep_options["device_type"] = "GPU";
sessionOptions.AppendExecutionProvider_V2(*env, {epDevices["OpenVINOExecutionProvider"]}, ep_options);

// Create session
auto session = std::make_unique<Ort::Session>(*env, modelInMemory.data(), modelInMemory.size(), sessionOptions);
```
- Even if I set "GPU" as device_type, the NPU gets selected. If I want to force the GPU, I have to filter the epDevices["OpenVINOExecutionProvider"] list.
- On a network with mixed FP16/FP32 precision, if I filter the list to get only the GPU and use precision "ACCURACY" or "FP32", the inference result is wrong. If I filter the list to get only the CPU, I get a proper result. I expect GPU FP32 and CPU to be close in terms of results. Note that this network works properly with DirectML on the same hardware, and it also works with GPUs from other vendors, both with DirectML and their respective execution providers. I suspect the provider is not honoring the "precision" option and just uses FP16 (which is known, for this network, to produce NaN values).
To reproduce
With an FP16 network:

```cpp
auto env = std::make_shared<Ort::Env>(ORT_LOGGING_LEVEL_VERBOSE, "OnnxInterface");

winrt::Microsoft::Windows::AI::MachineLearning::ExecutionProviderCatalog catalog =
    winrt::Microsoft::Windows::AI::MachineLearning::ExecutionProviderCatalog::GetDefault();
catalog.EnsureAndRegisterCertifiedAsync().get();

// Populate execution provider devices
std::unordered_map<std::string, std::vector<Ort::ConstEpDevice>> epDevices;
for (const auto& ep : env->GetEpDevices())
{
    DxO::FmtLogInfo("Available EP: {} - Device Name: {}", ep.EpName(), ep.Device().Vendor());
    epDevices[ep.EpName()].push_back(ep);
}

Ort::SessionOptions sessionOptions;
sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_DISABLE_ALL);

std::unordered_map<std::string, std::string> ep_options;
ep_options["precision"] = "FP32";
ep_options["device_type"] = "GPU";
sessionOptions.AppendExecutionProvider_V2(*env, {epDevices["OpenVINOExecutionProvider"]}, ep_options);

// Create session
auto session = std::make_unique<Ort::Session>(*env, modelInMemory.data(), modelInMemory.size(), sessionOptions);
```
then bind + run inference, and check whether the result is close to the expected FP32 result, and whether the GPU or the NPU was used.
Urgency
We manually select the CPU for this network, but the performance is obviously poor; we consider this an urgent issue.
Platform
Windows
OS Version
26100.6584
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.23
ONNX Runtime API
WinML
Architecture
X64
Execution Provider
OpenVINO
Execution Provider Library Version
1.8.15.0 (KB5067990)
The doc was updated, and the load_config option works
The appropriate way to set the device type with WinML+OV is outlined in this function:
```cpp
static std::pair<std::vector<Ort::ConstEpDevice>, std::vector<std::string>> ConfigureOVEPDevices(
    Ort::Env& env, Ort::SessionOptions& session_options, const Arguments& args,
    std::unordered_map<std::string, std::string>& ov_options) {
  std::cout << "Configuring OpenVINO Plugin EP with device type: " << args.device_type << std::endl;

  std::string meta_prefix;
  std::vector<std::string> ov_device_types;

  auto device_type_it = ov_options.find("device_type");
  auto device_type = device_type_it != ov_options.end() ? device_type_it->second : "CPU";
  ov_options.erase("device_type");

  // Split an optional meta prefix (e.g. "HETERO:") from the device list.
  std::string remainder;
  size_t colon_pos = device_type.find(':');
  if (colon_pos != std::string::npos) {
    meta_prefix = device_type.substr(0, colon_pos);
    remainder = device_type.substr(colon_pos + 1);
  } else {
    remainder = device_type;
  }

  // Split the comma-separated device list.
  size_t start = 0;
  while (start < remainder.size()) {
    size_t end = remainder.find(',', start);
    if (end == std::string::npos) end = remainder.size();
    if (end > start) {
      ov_device_types.emplace_back(remainder.substr(start, end - start));
    }
    start = end + 1;
  }

  auto ep_devices = env.GetEpDevices();

  // Find the EP device whose metadata "ov_device" entry matches the requested device.
  const auto get_ep_device = [&ep_devices](const std::string& ep_name,
                                           const std::string& ov_device) -> Ort::ConstEpDevice {
    for (Ort::ConstEpDevice& device : ep_devices) {
      if (std::string_view(device.EpName()).find(ep_name) != std::string::npos) {
        const auto& meta_kv = device.EpMetadata().GetKeyValuePairs();
        auto device_type_it = meta_kv.find("ov_device");
        if (device_type_it != meta_kv.end()) {
          if (device_type_it->second == ov_device) {
            return device;
          }
        }
      }
    }
    return Ort::ConstEpDevice{};
  };

  std::string ep_name = "OpenVINOExecutionProvider";
  if (!meta_prefix.empty()) {
    ep_name += "." + meta_prefix;
  }

  std::vector<Ort::ConstEpDevice> session_ep_devices;
  for (auto& ov_device : ov_device_types) {
    Ort::ConstEpDevice plugin_ep_device = get_ep_device(ep_name, ov_device);
    if (!plugin_ep_device) {
      // Retry without a device index suffix (e.g. "GPU.0" -> "GPU").
      size_t dot_pos = ov_device.find('.');
      if (dot_pos != std::string::npos) {
        ov_device.erase(dot_pos);
      }
      plugin_ep_device = get_ep_device(ep_name, ov_device);
      if (!plugin_ep_device) {
        throw std::runtime_error("Did not find an EP device with ep_name = " + ep_name +
                                 " & ov_device = " + ov_device);
      }
    }
    session_ep_devices.push_back(plugin_ep_device);
  }

  return {session_ep_devices, ov_device_types};
}
```
Get the EP devices from the env, get device information from the metadata, and add the available devices to a vector.