Triton warmup does not work for models with batch dimension > 1
I exported a YOLOv10 model to ONNX format with a batch size greater than 1:
from ultralytics import YOLOv10
yolov10_detector = YOLOv10.from_pretrained('jameslahm/yolov10x')
yolov10_detector.export(format="onnx", dynamic=False, opset=15, batch=16, simplify=False)
and ran it in tritonserver with the following config:
name: "yolov10"
platform: "onnxruntime_onnx"
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]
input [
  {
    name: "images"
    data_type: TYPE_FP32
    dims: [ 16, 3, 640, 640 ]
  }
]
output [
  {
    name: "output0"
    data_type: TYPE_FP32
    dims: [ 16, 300, 6 ]
  }
]
optimization { execution_accelerators {
  gpu_execution_accelerator : [ { name : "tensorrt" } ]
}}
So far, tritonserver runs without problems.
The code I use to send requests to the model in Triton raises an exception:
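For context, the config does not set max_batch_size, so as far as I understand Triton treats dims as the full tensor shape, batch axis included. Below is a minimal sketch of that rule and of the shape comparison the server appears to perform. The function names expected_request_shape and shape_matches are hypothetical and only illustrate the behaviour; they are not Triton or ultralytics APIs.

```python
from typing import List, Sequence


def expected_request_shape(
    dims: Sequence[int], max_batch_size: int = 0, batch: int = 1
) -> List[int]:
    """Return the full input shape a request must have (illustrative only).

    With max_batch_size == 0 the configured dims already include the batch
    axis; otherwise the batch axis is prepended to dims.
    """
    if max_batch_size > 0:
        return [batch, *dims]
    return list(dims)


def shape_matches(expected: Sequence[int], actual: Sequence[int]) -> bool:
    """Compare shapes, treating -1 in `expected` as a variable-size dimension."""
    if len(expected) != len(actual):
        return False
    return all(e == -1 or e == a for e, a in zip(expected, actual))


# The config in this report: fixed batch of 16, no max_batch_size.
expected = expected_request_shape([16, 3, 640, 640])
print(shape_matches(expected, [16, 3, 640, 640]))  # full batch is accepted
print(shape_matches(expected, [1, 3, 640, 640]))   # batch of 1 is rejected
```

Under this reading, any request whose first dimension is not 16 is rejected for this model.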
from ultralytics import YOLOv10
yolov10_detector = YOLOv10("", task="detect")
import numpy as np
imgs=[np.random.uniform(0.,255.,(h,w,c)).astype(np.uint8) for _ in range(yolov10_batchsz)]
Exception has occurred: InferenceServerException (note: full exception trace is shown but execution is paused at: _run_module_as_main)
[400] unexpected shape for input 'images' for model 'yolov10'. Expected [16,3,640,640], got [1,3,640,640]
File "/home/alex/.local/lib/python3.12/site-packages/tritonclient/http/_utils.py", line 69, in _raise_if_error
raise error
File "/home/alex/.local/lib/python3.12/site-packages/tritonclient/http/_client.py", line 1482, in infer
File "/home/alex/.local/lib/python3.12/site-packages/ultralytics/utils/triton.py", line 90, in __call__
outputs = self.triton_client.infer(model_name=self.endpoint, inputs=infer_inputs, outputs=infer_outputs)
File "/home/alex/.local/lib/python3.12/site-packages/ultralytics/nn/autobackend.py", line 516, in forward
y = self.model(im)
File "/home/alex/.local/lib/python3.12/site-packages/ultralytics/nn/autobackend.py", line 588, in warmup
self.forward(im) # warmup
File "/home/alex/.local/lib/python3.12/site-packages/ultralytics/engine/predictor.py", line 228, in stream_inference
self.model.warmup(imgsz=(1 if self.model.pt or self.model.triton else self.dataset.bs, 3, *self.imgsz))
File "/home/alex/.local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "/home/alex/.local/lib/python3.12/site-packages/ultralytics/engine/predictor.py", line 168, in __call__
return list(self.stream_inference(source, model, *args, **kwargs)) # merge list of Result into one
File "/home/alex/.local/lib/python3.12/site-packages/ultralytics/engine/model.py", line 441, in predict
return self.predictor.predict_cli(source=source) if is_cli else self.predictor(source=source, stream=stream)
File "/mnt/sdb/faces/video_proto_2/FacesAPI/issue.py", line 22, in <module>
File "/usr/local/lib/python3.12/runpy.py", line 88, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.12/runpy.py", line 198, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
tritonclient.utils.InferenceServerException: [400] unexpected shape for input 'images' for model 'yolov10'. Expected [16,3,640,640], got [1,3,640,640]
The problem is that the ultralytics API assumes models served by tritonserver always have a batch size of 1. See /home/alex/.local/lib/python3.12/site-packages/ultralytics/engine/predictor.py, line 228: self.model.warmup(imgsz=(1 if self.model.pt or self.model.triton else self.dataset.bs, 3, *self.imgsz)). That assumption does not always hold, as the fixed batch size of 16 in this model shows.
Could you please fix this bug?
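One possible direction, as a rough sketch only: choose the warmup batch size from the input shape declared in the Triton model config instead of hard-coding 1. The helper warmup_batch_size and its triton_input_dims parameter are hypothetical; I am assuming the dims of the first input could be read from the model config and passed in, which the current code does not do.

```python
from typing import Optional, Sequence


def warmup_batch_size(
    is_pt: bool,
    is_triton: bool,
    dataset_bs: int,
    triton_input_dims: Optional[Sequence[int]] = None,
) -> int:
    """Pick the batch size for the warmup tensor.

    Hypothetical helper, not part of ultralytics. `triton_input_dims` stands
    for the `dims` of the first input in the Triton model config,
    e.g. [16, 3, 640, 640] for the model in this report.
    """
    # Only a 4-element dims list (N, C, H, W) carries the batch axis; a
    # 3-element list means batching is handled through max_batch_size.
    if is_triton and triton_input_dims is not None and len(triton_input_dims) == 4:
        batch = int(triton_input_dims[0])
        # A dynamic batch axis is reported as -1; fall back to 1 in that case.
        return batch if batch > 0 else 1
    if is_pt or is_triton:
        return 1  # same result as the expression on predictor.py line 228
    return dataset_bs


# With the config from this report the warmup tensor would get batch size 16.
print(warmup_batch_size(False, True, 1, [16, 3, 640, 640]))
```

The warmup call could then pass (warmup_batch_size(...), 3, *self.imgsz) as imgsz; whether the real fix should read the config this way is for the maintainers to decide.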