Passing Custom Config to OpenVINO Compilation

Open Prabindas001 opened this issue 2 months ago • 8 comments

Search before asking

  • [x] I have searched the Ultralytics YOLO issues and discussions and found no similar questions.

Question

Hey team 👋,

I have a use case where I need to limit the number of CPU threads used during inference — for example, if the system has 10 threads, I want OpenVINO to use no more than 7.

I’d also like to take advantage of OpenVINO’s runtime compilation optimizations, but still keep the simplicity of:

model = YOLO("model_openvino/")

Currently, I don’t see a way to pass OpenVINO runtime options like inference_num_threads or PERFORMANCE_HINT through the YOLO API to the init function of the AutoBackend class.

I can manually compile and run the model with the OpenVINO runtime, but then I have to handle confidence thresholds, IoU filtering, strides, and so on myself, which adds a lot of overhead.

I couldn’t find any documentation or examples in Ultralytics explaining how to achieve this. If there’s a recommended approach or workaround, please let me know.

Thanks for the great work and the continued improvements to the integrations!

Additional

No response

Prabindas001 avatar Oct 06 '25 08:10 Prabindas001

👋 Hello @Prabindas001, thank you for your interest in Ultralytics 🚀! This is an automated response and an Ultralytics engineer will follow up with you shortly. In the meantime, we recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

For your OpenVINO-specific inference question, please also share:

  • Exact code snippet you’re using to load and run YOLO("model_openvino/")
  • The OpenVINO device you’re targeting (e.g., CPU) and desired runtime options (e.g., inference_num_threads, PERFORMANCE_HINT)
  • Environment details (OS, Python, PyTorch, OpenVINO versions) and whether you’ve tested the latest ultralytics release
  • Any logs or warnings observed during initialization or inference

Join the Ultralytics community where it suits you best. For real-time chat, head to Discord 🎧. Prefer in-depth discussions? Check out Discourse. Or dive into threads on our Subreddit to share knowledge with the community.

Upgrade

Upgrade to the latest ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8 to verify your issue is not already resolved in the latest version:

pip install -U ultralytics

UltralyticsAssistant avatar Oct 06 '25 08:10 UltralyticsAssistant

  • I have exported the .pt model to OpenVINO.
  • I am trying to run it on a CPU, and I want to set inference_num_threads dynamically during startup. I also want to be able to configure the performance hints during startup.
  • Environment: Mac M4, Python 3.12, Torch 2.8.0, OpenVINO 2025.3.0, Ultralytics 8.3.205.

Prabindas001 avatar Oct 06 '25 13:10 Prabindas001

Thanks for the detailed request — you’re right, the current YOLO("..._openvino_model/") path doesn’t expose OpenVINO compile properties yet; we’ve marked this as an enhancement to add a small ov_config passthrough to AutoBackend so you can set things like inference_num_threads and performance hints without leaving the YOLO API. For now, a clean workaround is to export with NMS baked in and compile via OpenVINO with your desired config:

# 1) Export with NMS to avoid manual postprocessing
from ultralytics import YOLO
YOLO("yolo11n.pt").export(format="openvino", nms=True)

# 2) Compile with custom threads + performance hint
import openvino as ov
from openvino import properties

core = ov.Core()
model = core.read_model("yolo11n_openvino_model/yolo11n.xml")
cfg = {
    properties.inference_num_threads(): 7,
    properties.hint.performance_mode(): properties.hint.PerformanceMode.LATENCY,
}
compiled = core.compile_model(model, "CPU", cfg)

Docs if helpful: OpenVINO export args (see nms) in the Ultralytics OpenVINO export docs and OpenVINO properties/hints in our performance hints guide.
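
With NMS baked into the export, the compiled model's output should already be post-processed, so what remains is reading the rows. The sketch below assumes each row is laid out as [x1, y1, x2, y2, score, class_id] and that unused rows are zero-padded; that layout is an assumption to verify against your own exported model, and parse_detections and Detection are hypothetical names used only for illustration.

```python
from typing import Iterable, NamedTuple, Sequence


class Detection(NamedTuple):
    box: tuple  # (x1, y1, x2, y2); assumed row layout, verify on your export
    score: float
    class_id: int


def parse_detections(rows: Iterable[Sequence[float]], conf: float = 0.25) -> list:
    """Keep rows whose score is at least `conf`; zero-padded rows drop out."""
    kept = []
    for row in rows:
        # Only the first six values are read; any extra columns are ignored.
        x1, y1, x2, y2, score, class_id = (float(v) for v in row[:6])
        if score >= conf:
            kept.append(Detection((x1, y1, x2, y2), score, int(class_id)))
    return kept
```

The rows for one image would typically come from something like compiled(input_tensor)[0][0]. Input preprocessing (resizing, normalization, channel order) and mapping boxes back to the original image size are still up to the caller on this path.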

glenn-jocher avatar Oct 06 '25 15:10 glenn-jocher

Hey Glenn,

Thanks for the response.

But as far as I can see, once we compile using ov.Core, I cannot load the compiled model into YOLO(). So do I lose the simplicity of the predict() options like vid_stride, conf, iou, etc.?

I did find a workaround, which doesn't seem optimized but does work.

I would really appreciate it if you could take a look and tell me whether this is the best workaround available, or whether there is a better way to compile the OpenVINO model at startup and then continue using the YOLO API for inference.

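import openvino as ov
from openvino import properties
from ultralytics import YOLO

MODEL_DIR = "od-Y11s-p2-v0.1_openvino_model"
MODEL_XML = f"{MODEL_DIR}/od-Y11s-p2-v0.1.xml"
DEVICE_SELECTION = "AUTO"
OV_CONFIG = {
    properties.inference_num_threads(): 3,
    properties.hint.performance_mode(): properties.hint.PerformanceMode.THROUGHPUT,
    properties.hint.num_requests(): 1
}
# INFER_PARAMS was not shown in the original post; these values are placeholders.
INFER_PARAMS = {"imgsz": 640, "conf": 0.25, "iou": 0.7}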

def load_yolo_model(model_dir: str) -> YOLO:
    print("🚀 Loading YOLO model...")
    return YOLO(model_dir, task="detect", verbose=True)

# Running warmup inference, to initialise YOLO backend
def warmup_predictor(model: YOLO, test_image="1.jpg"):
    _ = model(test_image, imgsz=INFER_PARAMS["imgsz"],
              conf=INFER_PARAMS["conf"],
              iou=INFER_PARAMS["iou"],
              device="CPU",
              show=False, verbose=False)
    print("Predictor initialized successfully.")
    return model.predictor.model

def compile_ov_model(xml_path: str, config: dict):
    print(f" Compiling OpenVINO model: {xml_path}")
    core = ov.Core()
    ov_model = core.read_model(xml_path)
    compiled = core.compile_model(ov_model, DEVICE_SELECTION, config)
    print("OpenVINO model compiled successfully.")
    return core, ov_model, compiled

def inject_compiled_backend(model: YOLO, ov_model, compiled_model):
    m = model.predictor.model
    print(f"Old backend id: {id(m.ov_compiled_model)}")

    m.device_name = DEVICE_SELECTION
    m.ov_model = ov_model
    m.ov_compiled_model = compiled_model
    model.predictor.model = m

    print(f"New backend id: {id(m.ov_compiled_model)}")

def verify_runtime_config(compiled_model):
    props = compiled_model.get_property("SUPPORTED_PROPERTIES")
    print("\n Runtime Configuration:")
    for key in ["INFERENCE_NUM_THREADS", "NUM_STREAMS", "PERFORMANCE_HINT", "EXECUTION_DEVICES"]:
        if key in props:
            try:
                print(f"{key}: {compiled_model.get_property(key)}")
            except Exception:
                pass

if __name__ == "__main__":
    model = load_yolo_model(MODEL_DIR)
    m = warmup_predictor(model)
    core, ov_model, compiled = compile_ov_model(MODEL_XML, OV_CONFIG)
    inject_compiled_backend(model, ov_model, compiled)
    verify_runtime_config(compiled)
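
To set the thread limit dynamically at startup instead of hard-coding it, the config could be derived from the machine's CPU count. This is a minimal sketch, not part of the original workaround: build_ov_config and its parameters are hypothetical names, and it uses the string property names INFERENCE_NUM_THREADS and PERFORMANCE_HINT (the same keys queried in verify_runtime_config above) on the assumption that your OpenVINO version accepts them as config keys.

```python
import os
from typing import Optional


def build_ov_config(reserve_threads: int = 3, hint: str = "LATENCY",
                    total_threads: Optional[int] = None) -> dict:
    """Return an OpenVINO property dict that leaves `reserve_threads` free."""
    # Fall back to the detected logical CPU count when no total is given.
    total = total_threads if total_threads is not None else (os.cpu_count() or 1)
    # Never go below one inference thread, even on small machines.
    usable = max(1, total - reserve_threads)
    return {"INFERENCE_NUM_THREADS": usable, "PERFORMANCE_HINT": hint}
```

On a 10-thread system with reserve_threads=3 this yields 7 inference threads; the resulting dict can be passed in place of OV_CONFIG when compiling.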

Prabindas001 avatar Oct 07 '25 06:10 Prabindas001

Train YOLO11n on COCO8 for 3 epochs

!yolo train model=yolo11n.pt data=/content/Linh-Kien-1/data.yaml epochs=3 imgsz=640

Ultralytics 8.3.205 🚀 Python-3.12.11 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/content/Linh-Kien-1/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=3, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolo11n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train3, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=/content/runs/detect/train3, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=True, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/engine/trainer.py", line 634, in get_dataset
    data = check_det_dataset(self.args.data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/data/utils.py", line 467, in check_det_dataset
    raise FileNotFoundError(m)
FileNotFoundError: Dataset '/content/Linh-Kien-1/data.yaml' images not found, missing path '/content/Linh-Kien-1/valid/images'
Note dataset download directory is '/content/datasets'. You can update this in '/root/.config/Ultralytics/settings.json'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 10, in <module>
    sys.exit(entrypoint())
             ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/cfg/__init__.py", line 990, in entrypoint
    getattr(model, mode)(**overrides)  # default args from model
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/engine/model.py", line 795, in train
    self.trainer = (trainer or self._smart_load("trainer"))(overrides=args, _callbacks=self.callbacks)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/models/yolo/detect/train.py", line 65, in __init__
    super().__init__(cfg, overrides, _callbacks)
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/engine/trainer.py", line 158, in __init__
    self.data = self.get_dataset()
                ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ultralytics/engine/trainer.py", line 638, in get_dataset
    raise RuntimeError(emojis(f"Dataset '{clean_url(self.args.data)}' error ❌ {e}")) from e
RuntimeError: Dataset '/content/Linh-Kien-1/data.yaml' error ❌ Dataset '/content/Linh-Kien-1/data.yaml' images not found, missing path '/content/Linh-Kien-1/valid/images'
Note dataset download directory is '/content/datasets'. You can update this in '/root/.config/Ultralytics/settings.json'

Can you help me with this error?

buimanhiep2962005-cell avatar Oct 07 '25 10:10 buimanhiep2962005-cell

@buimanhiep2962005-cell thanks for the report—this is unrelated to OpenVINO compile options, so please open a new issue for visibility and include your data.yaml, a short directory tree, and yolo checks output. Quick fix: the error shows Ultralytics is looking for /content/Linh-Kien-1/valid/images; ensure your YAML uses val and that the paths actually exist, e.g.:

# /content/Linh-Kien-1/data.yaml
path: /content/Linh-Kien-1
train: images/train
val: images/val
names: [class0, class1]

Expected folders:

/content/Linh-Kien-1/
  images/train ...  labels/train ...
  images/val   ...  labels/val   ...

Dataset format details are in the dataset guide; if you’re in Colab/Notebook, you can paste text outputs and run !yolo checks to attach system diagnostics.
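
Before re-running training, a quick check that the expected folders exist can confirm the fix. This is a small sketch assuming the images/<split> and labels/<split> layout shown above; missing_dataset_dirs is a hypothetical helper, not an Ultralytics function.

```python
from pathlib import Path


def missing_dataset_dirs(root: str, splits=("train", "val")) -> list:
    """Return expected images/<split> and labels/<split> dirs that do not exist."""
    base = Path(root)
    expected = [base / kind / split for split in splits for kind in ("images", "labels")]
    return [str(p) for p in expected if not p.is_dir()]
```

Calling missing_dataset_dirs("/content/Linh-Kien-1") returns an empty list once every expected folder is in place; any path it returns needs to be created or the YAML adjusted to match the real layout.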

glenn-jocher avatar Oct 07 '25 19:10 glenn-jocher

Hey Glenn,

The comment by @buimanhiep2962005-cell was unrelated to this issue, and I think that because of it you missed my earlier message about the question I originally raised here. Could you please take a look and share your advice?

Thanks

Prabindas001 avatar Oct 08 '25 07:10 Prabindas001

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

  • Docs: https://docs.ultralytics.com
  • HUB: https://hub.ultralytics.com
  • Community: https://community.ultralytics.com

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

github-actions[bot] avatar Nov 08 '25 00:11 github-actions[bot]