
New way to register NMS in ONNX for TensorRT, ONNX Runtime and OpenVINO

triple-Mu opened this issue 2 years ago · 8 comments

The previous PR provided a basic solution for exporting the ONNX model and modifying it afterwards. This PR improves on that so that registering NMS depends entirely on PyTorch. Simple yet effective!
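
For context, a minimal sketch of the general idea (not the PR's exact code): TensorRT's EfficientNMS_TRT plugin can be exposed as a custom ONNX op from pure PyTorch by giving a torch.autograd.Function a symbolic() method. The TRTEfficientNMS wrapper name is hypothetical; the attribute names follow the TensorRT plugin documentation.

import torch
from torch import Tensor


class TRTEfficientNMS(torch.autograd.Function):
    @staticmethod
    def forward(ctx, boxes: Tensor, scores: Tensor, iou_threshold: float = 0.45,
                score_threshold: float = 0.25, max_output_boxes: int = 100):
        # Dummy outputs with the right shapes/dtypes so tracing succeeds;
        # TensorRT supplies the real kernel at engine build time.
        batch = boxes.shape[0]
        num_det = torch.randint(0, max_output_boxes, (batch, 1), dtype=torch.int32)
        det_boxes = torch.randn(batch, max_output_boxes, 4)
        det_scores = torch.randn(batch, max_output_boxes)
        det_classes = torch.randint(0, 80, (batch, max_output_boxes), dtype=torch.int32)
        return num_det, det_boxes, det_scores, det_classes

    @staticmethod
    def symbolic(g, boxes, scores, iou_threshold=0.45, score_threshold=0.25,
                 max_output_boxes=100):
        # Emit a node in the "TRT" domain; TensorRT resolves it to the
        # EfficientNMS_TRT plugin when the engine is built.
        return g.op("TRT::EfficientNMS_TRT", boxes, scores,
                    plugin_version_s="1",
                    background_class_i=-1,
                    box_coding_i=0,
                    score_activation_i=0,
                    iou_threshold_f=iou_threshold,
                    score_threshold_f=score_threshold,
                    max_output_boxes_i=max_output_boxes,
                    outputs=4)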

triple-Mu avatar Jun 04 '22 15:06 triple-Mu

@triple-Mu thanks for the PR! The easiest argument structure is to simply use an --nms arg that is handled appropriately for each NMS-capable format.

glenn-jocher avatar Jun 09 '22 23:06 glenn-jocher

This looks good @triple-Mu, and the process looks correct (I agree with @glenn-jocher's comment above).

We have found that the EfficientNMS plugin does not always behave predictably: FP16 support was only merged recently, so it does not work correctly in environments such as the NVIDIA DeepStream Docker images (which ship older TensorRT versions). It would be better to use the TensorRT BatchedNMS plugin, which has been around longer and is more stable. Given you are already converting (cx, cy, w, h) to top-left/bottom-right (tlbr) format, it should be easy to update 👍
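
For reference, the (cx, cy, w, h) to corner-format conversion mentioned above is just a few tensor ops; this sketch mirrors the xywh2xyxy helper that already ships in YOLOv5's utils/general.py:

import torch

def xywh2xyxy(x: torch.Tensor) -> torch.Tensor:
    # Convert (cx, cy, w, h) boxes to (x1, y1, x2, y2) corner format,
    # the layout the TensorRT NMS plugins consume.
    y = x.clone()
    y[..., 0] = x[..., 0] - x[..., 2] / 2  # x1 = cx - w/2
    y[..., 1] = x[..., 1] - x[..., 3] / 2  # y1 = cy - h/2
    y[..., 2] = x[..., 0] + x[..., 2] / 2  # x2 = cx + w/2
    y[..., 3] = x[..., 1] + x[..., 3] / 2  # y2 = cy + h/2
    return y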

visualcortex-team avatar Jun 15 '22 01:06 visualcortex-team

FP16 support was only merged in recently

Hi @visualcortex-team, TensorRT has supported EfficientNMS FP16 mode since 8.2.4.

It would be better to use the TensorRT BatchedNMS plugin which has been around longer and is more stable.

But the EfficientNMS plugin is much faster than BatchedNMS. BTW, TensorRT 8.4 GA was released today.

zhiqwang avatar Jun 15 '22 02:06 zhiqwang

Thanks @zhiqwang. I guess people will either:

  • need to know that the minimum supported TensorRT version for EfficientNMS is 8.2.4+ (it will not throw errors; it will just produce no results) - this could be added as a warning when exporting? (see the sketch after this list)

or

  • be given a flag to choose which NMS to export with (BatchedNMS vs EfficientNMS)
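
A minimal sketch of the first option, the export-time warning (the 8.2.4 floor comes from the discussion above; the function name and logger are illustrative, not existing yolov5 code):

import logging
from pkg_resources import parse_version

LOGGER = logging.getLogger(__name__)

def warn_if_efficientnms_unsupported():
    # EfficientNMS FP16 only behaves correctly on TensorRT >= 8.2.4; older
    # versions fail silently (no errors, just no detections), so warn at export.
    try:
        import tensorrt as trt
    except ImportError:
        return  # TensorRT not installed on the export machine; nothing to check
    if parse_version(trt.__version__) < parse_version("8.2.4"):
        LOGGER.warning(
            f"TensorRT {trt.__version__} < 8.2.4: EfficientNMS may silently "
            "return no results. Upgrade TensorRT or export with BatchedNMS."
        )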

visualcortex-team avatar Jun 15 '22 02:06 visualcortex-team

Hi @visualcortex-team

need to know that the minimum supported version of TensorRT to use EfficientNMS is 8.2.4+ (as it will not throw errors it will just produce no results) - this could be added as a warning when exporting?

Agreed!

A flag is provided to choose which NMS to export with (BatchedNMS vs EfficientNMS)

Supporting the BatchedNMS plugin is certainly no problem from a technical point of view. However, TensorRT seems to have modeled BatchedNMS on TensorFlow's interface, and it is very slow, so in my view there is no need to support it.

zhiqwang avatar Jun 15 '22 03:06 zhiqwang

For future readers:

  • the TensorRT release @zhiqwang mentioned updates the apt packages to tensorrt-dev/unknown 8.4.1.5-1+cuda11.6 amd64, which works correctly with the EfficientNMS plugin.

  • to decode the bounding boxes when used with NVIDIA DeepStream, you will need a custom decoder implementation in nvdsinfer_custombboxparser.cpp using these mappings (unfortunately none of the others work):

object.left   = p_bboxes[4 * i];                    // x1 (top-left x)
object.top    = p_bboxes[4 * i + 1];                // y1 (top-left y)
object.width  = p_bboxes[4 * i + 2] - object.left;  // x2 - x1
object.height = p_bboxes[4 * i + 3] - object.top;   // y2 - y1
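
These mappings work because the plugin emits each box in (x1, y1, x2, y2) corner order, so left and top are read directly while width and height fall out as the corner differences.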

visualcortex-team avatar Jun 23 '22 22:06 visualcortex-team

Why can't I pass the CI test? @glenn-jocher

triple-Mu avatar Jul 22 '22 09:07 triple-Mu

@glenn-jocher I am very happy that yolov5 will support dynamic batch at https://github.com/ultralytics/yolov5/pull/8526. At the same time, I have also applied dynamic batch to the registered NMS!

  • --nms (default): embeds TensorRT NMS in the ONNX model, or TF.js NMS in the TF.js model.
  • --nms 0 (or any int): embeds ONNX NMS in the ONNX model; the integer value is the same as max-wh in NMS.
  • --dynamic (default): makes all axes dynamic in the ONNX or TF.js model; does not support ONNX for TensorRT.
  • --dynamic 0: makes only the batch axis dynamic in the ONNX model; this also supports ONNX for TensorRT.
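
For readers unfamiliar with max-wh: in YOLOv5's non_max_suppression it is the per-class coordinate offset that lets one class-agnostic NMS call act as per-class NMS. A minimal sketch of the trick (class_aware_nms is an illustrative name, not existing yolov5 code):

import torch
import torchvision

def class_aware_nms(boxes, scores, classes, iou_thres=0.45, max_wh=7680):
    # boxes: (N, 4) in xyxy, scores: (N,), classes: (N,) integer labels.
    # Offset each box by class_index * max_wh so boxes of different classes
    # can never overlap, then run one class-agnostic NMS over all of them.
    offsets = classes[:, None].float() * max_wh  # (N, 1), broadcast over xyxy
    keep = torchvision.ops.nms(boxes + offsets, scores, iou_thres)
    return keep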

triple-Mu avatar Jul 28 '22 12:07 triple-Mu

@triple-Mu thanks for the input! We'll make sure to consider the dynamic batching and NMS options in our future development. Your feedback is greatly appreciated!

glenn-jocher avatar Nov 15 '23 14:11 glenn-jocher