Passing Custom Config to OpenVINO Compilation
Search before asking
- [x] I have searched the Ultralytics YOLO issues and discussions and found no similar questions.
Question
Hey team 👋,
I have a use case where I need to limit the number of CPU threads used during inference — for example, if the system has 10 threads, I want OpenVINO to use no more than 7.
I’d also like to take advantage of OpenVINO’s runtime compilation optimizations, but still keep the simplicity of:
model = YOLO("model_openvino/")
Currently, I don’t see a way to pass OpenVINO runtime options like inference_num_threads or PERFORMANCE_HINT through the YOLO API to the init function of the AutoBackend class.
I can manually compile and run the model with the OpenVINO runtime, but then I have to handle a lot myself, including confidence thresholds, IoU filtering, strides, etc., which adds a lot of overhead.
I couldn’t find any documentation or examples in Ultralytics explaining how to achieve this. If there’s a recommended approach or workaround, please let me know.
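As a starting point for the thread cap described above, the sketch below derives a thread budget from the machine's logical CPU count and builds a config dict. `build_ov_config` and its `reserve` parameter are hypothetical names. The string keys `INFERENCE_NUM_THREADS` and `PERFORMANCE_HINT` are OpenVINO property names, and the dict is intended for `core.compile_model(model, "CPU", config)` rather than for the YOLO API, which does not currently accept it.

```python
import os


def build_ov_config(total_threads=None, reserve=3, hint="LATENCY"):
    """Return an OpenVINO-style config dict that leaves `reserve` threads free.

    `build_ov_config` and `reserve` are illustrative names, not library API.
    """
    if total_threads is None:
        # os.cpu_count() can return None on some platforms; fall back to 1.
        total_threads = os.cpu_count() or 1
    # Never go below one inference thread, even on small machines.
    num_threads = max(1, total_threads - reserve)
    return {
        "INFERENCE_NUM_THREADS": num_threads,
        "PERFORMANCE_HINT": hint,
    }


if __name__ == "__main__":
    # 10 logical threads with 3 held back gives a cap of 7.
    print(build_ov_config(total_threads=10, reserve=3))
```

Computing the cap at startup, instead of hard-coding it, keeps the same script usable on machines with different core counts.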
Thanks for the great work and the continued improvements to integrations!
Additional
No response
👋 Hello @Prabindas001, thank you for your interest in Ultralytics 🚀! This is an automated response and an Ultralytics engineer will follow up with you shortly. In the meantime, we recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.
If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
For your OpenVINO-specific inference question, please also share:
- Exact code snippet you’re using to load and run YOLO("model_openvino/")
- The OpenVINO device you’re targeting (e.g., CPU) and desired runtime options (e.g., inference_num_threads, PERFORMANCE_HINT)
- Environment details (OS, Python, PyTorch, OpenVINO versions) and whether you’ve tested the latest ultralytics release
- Any logs or warnings observed during initialization or inference
Join the Ultralytics community where it suits you best. For real-time chat, head to Discord 🎧. Prefer in-depth discussions? Check out Discourse. Or dive into threads on our Subreddit to share knowledge with the community.
Upgrade
Upgrade to the latest ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8 to verify your issue is not already resolved in the latest version:
pip install -U ultralytics
Environments
YOLO may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
- Google Cloud Deep Learning VM. See GCP Quickstart Guide
- Amazon Deep Learning AMI. See AWS Quickstart Guide
- Docker Image. See Docker Quickstart Guide
Status
If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLO Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
- I have exported the .pt model to OpenVINO format.
- I am running it on a CPU and want to set inference_num_threads dynamically at startup. I also want to be able to configure the performance hint at startup.
- Environment: Mac M4, Python 3.12, Torch 2.8.0, OpenVINO 2025.3.0, Ultralytics 8.3.205.
Thanks for the detailed request — you’re right, the current YOLO("..._openvino_model/") path doesn’t expose OpenVINO compile properties yet; we’ve marked this as an enhancement to add a small ov_config passthrough to AutoBackend so you can set things like inference_num_threads and performance hints without leaving the YOLO API. For now, a clean workaround is to export with NMS baked in and compile via OpenVINO with your desired config:
# 1) Export with NMS to avoid manual postprocessing
from ultralytics import YOLO
YOLO("yolo11n.pt").export(format="openvino", nms=True)
# 2) Compile with custom threads + performance hint
import openvino as ov
from openvino import properties
core = ov.Core()
model = core.read_model("yolo11n_openvino_model/yolo11n.xml")
cfg = {
    properties.inference_num_threads(): 7,
    properties.hint.performance_mode(): properties.hint.PerformanceMode.LATENCY,
}
compiled = core.compile_model(model, "CPU", cfg)
Docs if helpful: OpenVINO export args (see nms) in the Ultralytics OpenVINO export docs and OpenVINO properties/hints in our performance hints guide.
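With `nms=True`, the exported model is expected to return already-suppressed detections, so the remaining post-processing is small. The sketch below assumes each output row is laid out as `[x1, y1, x2, y2, score, class_id]` and that unused rows are zero-padded. Both are assumptions worth checking against your own export by printing the output shape. `filter_detections` is a hypothetical helper, not part of Ultralytics or OpenVINO.

```python
def filter_detections(rows, conf_thres=0.25):
    """Keep rows whose score is at least `conf_thres`.

    `rows` is any iterable of 6-element sequences assumed to be
    [x1, y1, x2, y2, score, class_id]; a NumPy array of shape (N, 6) also works.
    """
    kept = []
    for row in rows:
        x1, y1, x2, y2, score, class_id = (float(v) for v in row)
        if score >= conf_thres:
            kept.append(
                {"box": (x1, y1, x2, y2), "score": score, "class_id": int(class_id)}
            )
    return kept


if __name__ == "__main__":
    # Two made-up rows plus one zero-padded row, standing in for model output.
    fake_output = [
        [10.0, 20.0, 110.0, 220.0, 0.91, 0.0],
        [15.0, 25.0, 50.0, 60.0, 0.12, 2.0],
        [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    ]
    for det in filter_detections(fake_output, conf_thres=0.25):
        print(det)
```

Note that this route still leaves image preprocessing (resize, letterbox, normalization) and box rescaling to the caller, which is the overhead the YOLO API normally hides.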
Hey Glenn,
Thanks for the response.
But as far as I can see, if I compile with the OpenVINO Core, I cannot load the compiled model back into YOLO(). So do I lose the simplicity of the predict() arguments such as vid_stride, conf, iou, etc.?
I did find a workaround. It doesn't seem optimized, but it works.
I would really appreciate it if you could take a look and tell me whether this is the best workaround available, or whether there is a better way to compile the OpenVINO model with a custom config at startup and then keep using the YOLO API for inference.
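import openvino as ov
from openvino import properties
from ultralytics import YOLO

MODEL_DIR = "od-Y11s-p2-v0.1_openvino_model"
MODEL_XML = f"{MODEL_DIR}/od-Y11s-p2-v0.1.xml"
DEVICE_SELECTION = "AUTO"
# INFER_PARAMS was not shown in the original post; these values are placeholders.
INFER_PARAMS = {"imgsz": 640, "conf": 0.25, "iou": 0.7}
OV_CONFIG = {
    properties.inference_num_threads(): 3,
    properties.hint.performance_mode(): properties.hint.PerformanceMode.THROUGHPUT,
    properties.hint.num_requests(): 1,
}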
def load_yolo_model(model_dir: str) -> YOLO:
    print("🚀 Loading YOLO model...")
    return YOLO(model_dir, task="detect", verbose=True)

# Run a warmup inference to initialise the YOLO backend (predictor + AutoBackend)
def warmup_predictor(model: YOLO, test_image="1.jpg"):
    _ = model(test_image, imgsz=INFER_PARAMS["imgsz"],
              conf=INFER_PARAMS["conf"],
              iou=INFER_PARAMS["iou"],
              device="CPU",
              show=False, verbose=False)
    print("Predictor initialized successfully.")
    return model.predictor.model

def compile_ov_model(xml_path: str, config: dict):
    print(f" Compiling OpenVINO model: {xml_path}")
    core = ov.Core()
    ov_model = core.read_model(xml_path)
    compiled = core.compile_model(ov_model, DEVICE_SELECTION, config)
    print("OpenVINO model compiled successfully.")
    return core, ov_model, compiled

# Swap the backend's compiled model for the one compiled with the custom config
def inject_compiled_backend(model: YOLO, ov_model, compiled_model):
    m = model.predictor.model
    print(f"Old backend id: {id(m.ov_compiled_model)}")
    m.device_name = DEVICE_SELECTION
    m.ov_model = ov_model
    m.ov_compiled_model = compiled_model
    model.predictor.model = m
    print(f"New backend id: {id(m.ov_compiled_model)}")

# Print the properties the runtime actually applied
def verify_runtime_config(compiled_model):
    props = compiled_model.get_property("SUPPORTED_PROPERTIES")
    print("\n Runtime Configuration:")
    for key in ["INFERENCE_NUM_THREADS", "NUM_STREAMS", "PERFORMANCE_HINT", "EXECUTION_DEVICES"]:
        if key in props:
            try:
                print(f"{key}: {compiled_model.get_property(key)}")
            except Exception:
                pass

if __name__ == "__main__":
    model = load_yolo_model(MODEL_DIR)
    m = warmup_predictor(model)
    core, ov_model, compiled = compile_ov_model(MODEL_XML, OV_CONFIG)
    inject_compiled_backend(model, ov_model, compiled)
    verify_runtime_config(compiled)
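The injection step above relies on attributes that are internal to Ultralytics (`ov_compiled_model` comes from the snippet, not from any documented API), so it may break between releases. A slightly more defensive sketch follows: it refuses to patch a backend that lacks the attribute and reports requested settings that the runtime did not apply. `swap_compiled_model` and `config_mismatches` are hypothetical helpers, and the stand-in objects in the demo only mimic the shape of the real ones.

```python
from types import SimpleNamespace


def swap_compiled_model(backend, compiled_model, device_name):
    """Replace `backend.ov_compiled_model` and return the previous value.

    Raises AttributeError if the backend has no such attribute, instead of
    silently adding a new attribute that inference never reads.
    """
    if not hasattr(backend, "ov_compiled_model"):
        raise AttributeError("backend has no 'ov_compiled_model' attribute")
    previous = backend.ov_compiled_model
    backend.ov_compiled_model = compiled_model
    backend.device_name = device_name
    return previous


def config_mismatches(requested, get_property):
    """Return {key: (wanted, actual)} for settings the runtime did not apply.

    `get_property` is any callable mapping a property name to its value,
    for example a bound `get_property` method of a compiled model.
    """
    mismatches = {}
    for key, wanted in requested.items():
        actual = get_property(key)
        # String comparison keeps the sketch simple; enum values may need normalising.
        if str(actual) != str(wanted):
            mismatches[key] = (wanted, actual)
    return mismatches


if __name__ == "__main__":
    # Stand-ins for the real backend and compiled-model objects.
    backend = SimpleNamespace(ov_compiled_model="old", device_name="CPU")
    effective = {"INFERENCE_NUM_THREADS": 3, "PERFORMANCE_HINT": "LATENCY"}
    new_model = SimpleNamespace(get_property=effective.get)

    swap_compiled_model(backend, new_model, "AUTO")
    wanted = {"INFERENCE_NUM_THREADS": 3, "PERFORMANCE_HINT": "THROUGHPUT"}
    print(config_mismatches(wanted, backend.ov_compiled_model.get_property))
```

Checking the effective properties after the swap matters most with `AUTO`, where the runtime chooses the execution device and may treat device-specific settings differently.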
Train YOLO11n on COCO8 for 3 epochs
!yolo train model=yolo11n.pt data=/content/Linh-Kien-1/data.yaml epochs=3 imgsz=640
Ultralytics 8.3.205 🚀 Python-3.12.11 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/content/Linh-Kien-1/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=3, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolo11n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train3, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=/content/runs/detect/train3, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=True, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/engine/trainer.py", line 634, in get_dataset
    data = check_det_dataset(self.args.data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/data/utils.py", line 467, in check_det_dataset
    raise FileNotFoundError(m)
FileNotFoundError: Dataset '/content/Linh-Kien-1/data.yaml' images not found, missing path '/content/Linh-Kien-1/valid/images'
Note dataset download directory is '/content/datasets'. You can update this in '/root/.config/Ultralytics/settings.json'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/yolo", line 10, in
Can you help me with this error?
@buimanhiep2962005-cell thanks for the report—this is unrelated to OpenVINO compile options, so please open a new issue for visibility and include your data.yaml, a short directory tree, and yolo checks output. Quick fix: the error shows Ultralytics is looking for /content/Linh-Kien-1/valid/images; ensure your YAML uses val and that the paths actually exist, e.g.:
# /content/Linh-Kien-1/data.yaml
path: /content/Linh-Kien-1
train: images/train
val: images/val
names: [class0, class1]
Expected folders:
/content/Linh-Kien-1/
  images/train ...
  labels/train ...
  images/val ...
  labels/val ...
Dataset format details are in the dataset guide; if you’re in Colab/Notebook, you can paste text outputs and run !yolo checks to attach system diagnostics.
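One way to catch this kind of path error before training starts is to resolve the YAML entries the way the example above lays them out and check that each directory exists. The sketch takes an already-parsed dict, so it needs no YAML library. `missing_dataset_dirs` is a hypothetical helper, not an Ultralytics function, and it only handles the simple case where `train` and `val` are single directory strings, relative to `path` unless absolute.

```python
from pathlib import Path


def missing_dataset_dirs(cfg):
    """Return the split directories named in `cfg` that do not exist on disk."""
    root = Path(cfg.get("path", "."))
    missing = []
    for split in ("train", "val"):
        entry = cfg.get(split)
        if entry is None:
            # A missing key is reported by name rather than as a path.
            missing.append(f"<no '{split}' key>")
            continue
        split_dir = Path(entry)
        if not split_dir.is_absolute():
            split_dir = root / split_dir
        if not split_dir.is_dir():
            missing.append(str(split_dir))
    return missing


if __name__ == "__main__":
    cfg = {
        "path": "/content/Linh-Kien-1",
        "train": "images/train",
        "val": "images/val",
    }
    for problem in missing_dataset_dirs(cfg):
        print("missing:", problem)
```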
Hey Glenn,
The comment by @buimanhiep2962005-cell was unrelated to this issue, and I think that because of it you missed my previous message on the original question. Could you please take a look and share your advice?
Thanks
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
- Docs: https://docs.ultralytics.com
- HUB: https://hub.ultralytics.com
- Community: https://community.ultralytics.com
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐