onnxruntime_backend: Error while Loading YOLOv8 Model with EfficientNMS_TRT Plugin in TRITON

Issue Description:

I am encountering an error while trying to load a YOLOv8 model with the EfficientNMS_TRT plugin in TRITON. The specific error message I am receiving is:

    UNAVAILABLE: Internal: onnx runtime error 1: Load model from /models/yolov8_onnx/1/model.onnx failed: Fatal error: TRT:EfficientNMS_TRT(-1) is not a registered function/op

Steps to Reproduce:

Export YOLOv8 model with EfficientNMS_TRT plugin.
Attempt to load the exported model into TRITON.

Expected Behavior:

The YOLOv8 model with the EfficientNMS_TRT plugin should load into TRITON without any errors.

Actual Behavior:

Loading the model into TRITON fails with the error message shown above.

Additional Information:

YOLOv8 model was exported with EfficientNMS_TRT plugin.
The error seems to be related to the EfficientNMS_TRT plugin not being registered.
TRITON version: 23.05-py3-sdk
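
To make the "not registered" observation concrete, the failing operator can be pulled out of the error string. This is only a minimal sketch: the regular expression and the helper name `parse_unregistered_op` are hypothetical, and reading the three fields as domain, op type, and version is an assumption based on the shape of the message quoted in this report.

```python
import re

# Shape of the message seen in this report:
#   "<domain>:<op_type>(<version>) is not a registered function/op"
# Interpreting the fields as domain / op type / version is an assumption.
UNREGISTERED_OP = re.compile(
    r"(?P<domain>[\w.]*):(?P<op>\w+)\((?P<version>-?\d+)\) "
    r"is not a registered function/op"
)


def parse_unregistered_op(message: str):
    """Return (domain, op_type, version) or None if the message does not match."""
    match = UNREGISTERED_OP.search(message)
    if match is None:
        return None
    return match["domain"], match["op"], int(match["version"])


# The error text copied from the report above.
ERROR = (
    "UNAVAILABLE: Internal: onnx runtime error 1: Load model from "
    "/models/yolov8_onnx/1/model.onnx failed: Fatal error: "
    "TRT:EfficientNMS_TRT(-1) is not a registered function/op"
)

print(parse_unregistered_op(ERROR))  # ('TRT', 'EfficientNMS_TRT', -1)
```

Under that reading, the node that fails to resolve is `EfficientNMS_TRT` in a domain named `TRT`, which matches the plugin named in the export step.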

YOLOv8 Model Export Code:

    import torch
    from ultralytics import YOLO

    input_shape = [1, 3, 640, 640]
    device = 'cpu'
    weights = 'path_to_yolov8_weights.pt'
    topk = 100  # not used in this snippet

    YOLOv8 = YOLO(weights)
    model = YOLOv8.model.fuse().eval()

    # optim() is not defined in this snippet; it comes from the rest of the export script
    for m in model.modules():
        optim(m)
        m.to(device)

    model.to(device)
    fake_input = torch.randn(input_shape).to(device)

    # Run one forward pass before exporting
    model(fake_input)

    save_path = weights.replace('.pt', '.onnx')

    # The exported model is written to save_path; the return value is not needed
    torch.onnx.export(
        model,
        fake_input,
        save_path,
        input_names=['images'],
        output_names=['num_dets', 'bboxes', 'scores', 'labels'])

    print(f'ONNX export success, saved as {save_path}')

TRITON Model Configuration:

    platform: "onnxruntime_onnx"
    max_batch_size: 0
    input [
      {
        name: "images"
        data_type: TYPE_FP32
        dims: [ 1, 3, 640, 640 ]
      }
    ]
    output [
      {
        name: "output0"
        data_type: TYPE_FP32
        dims: [ -1, -1, -1 ]
      }
    ]
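
One detail in the two snippets above: the export code names four outputs (`num_dets`, `bboxes`, `scores`, `labels`), while the configuration declares a single output called `output0`. Whether this plays any part in the load failure is not established here. The sketch below only shows one way to compare the two sets of names; the helpers `section_body` and `declared_names` are hypothetical, and the bracket-matching parser is a simplification that handles only configuration text shaped like the one above.

```python
import re


def section_body(config_text: str, section: str) -> str:
    """Return the text inside the brackets of an `input [...]` or `output [...]` section."""
    start = re.search(rf"(?m)^\s*{section}\s*\[", config_text)
    if start is None:
        return ""
    depth = 0
    # Walk from the opening bracket to its matching close, so nested
    # brackets such as `dims: [ ... ]` do not end the section early.
    for i in range(start.end() - 1, len(config_text)):
        if config_text[i] == "[":
            depth += 1
        elif config_text[i] == "]":
            depth -= 1
            if depth == 0:
                return config_text[start.end():i]
    return ""


def declared_names(config_text: str, section: str) -> list:
    """List the tensor names declared in one section of the configuration text."""
    return re.findall(r'name:\s*"([^"]+)"', section_body(config_text, section))


# Configuration text copied from the report.
CONFIG = """
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "images"
    data_type: TYPE_FP32
    dims: [ 1, 3, 640, 640 ]
  }
]
output [
  {
    name: "output0"
    data_type: TYPE_FP32
    dims: [ -1, -1, -1 ]
  }
]
"""

# Output names passed to torch.onnx.export in the report.
EXPORT_OUTPUTS = ["num_dets", "bboxes", "scores", "labels"]

declared = declared_names(CONFIG, "output")
missing = [name for name in EXPORT_OUTPUTS if name not in declared]
print("declared in config:", declared)
print("exported but not declared:", missing)
```

For the text in this report, the check lists `output0` as the only declared output and all four exported names as undeclared.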

Possible Solutions Attempted:

Verified that the EfficientNMS_TRT plugin is correctly included during model export.
Checked for any compatibility issues between the TRITON version and the ONNX Runtime version.

Request for Assistance:

I'm seeking guidance on how to properly load a YOLOv8 model with the EfficientNMS_TRT plugin in TRITON. Any insights, suggestions, or steps to resolve this issue would be greatly appreciated. Thank you!

Aug 30 '23 07:08 whitewalker11

@tanmayv25 @oandreeva-nv Do you have some insights into loading a YOLOv8 model with the EfficientNMS_TRT plugin in TRITON?

Aug 30 '23 18:08 kthui

@whitewalker11 Did you try specifying the custom op plugin as specified here? https://github.com/triton-inference-server/server/blob/main/docs/user_guide/custom_operations.md#onnx

Aug 30 '23 21:08 tanmayv25
