yolov7-pose
YoloLayer CUDA file for batch processing with V2DynamicExt
Hi @nanmi, I want to run the TensorRT model with batch size 8, but when I infer with a batch of 8 I only get results for the first image and the remaining outputs are zero, because this model only supports batch 1. So I exported an ONNX model with a dynamic batch size successfully, but when I convert it to TensorRT I get the error below:
ERROR: [TRT]: YoloLayer_TRT_0: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
ERROR: [TRT]: ModelImporter.cpp:726: While parsing node number 495 [YoloLayer_TRT -> "output0"]:
ERROR: [TRT]: ModelImporter.cpp:727: --- Begin node ---
ERROR: [TRT]: ModelImporter.cpp:728: input: "745" input: "783" input: "821" input: "859" output: "output0" name: "YoloLayer_TRT_0" op_type: "YoloLayer_TRT"
ERROR: [TRT]: ModelImporter.cpp:729: --- End node ---
ERROR: [TRT]: ModelImporter.cpp:732: ERROR: ModelImporter.cpp:185 In function parseGraph: [6] Invalid Node - YoloLayer_TRT_0 YoloLayer_TRT_0: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
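For batched inference to return non-zero results for every image, two things have to hold: the engine must be built with a dynamic batch dimension, and the execution context has to be given the actual input shape at inference time. Below is a minimal sketch of that runtime step, assuming the TensorRT 8.x C++ API; the binding names "images"/"output0" and the engine/plugin file names come from the commands later in this thread, while the buffer sizes and everything else are placeholders, not the repo's code.

// infer_dynamic_batch.cpp - minimal sketch of running a dynamic-batch engine (TensorRT 8.x).
// Binding names "images"/"output0" come from this thread; everything else is a placeholder.
#include <NvInfer.h>
#include <cuda_runtime.h>
#include <dlfcn.h>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) noexcept override
    {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

int main()
{
    // The custom YoloLayer plugin must be loaded before deserializing the engine.
    dlopen("./libyolo.so", RTLD_NOW | RTLD_GLOBAL);

    std::ifstream file("yolov7_w6_pose_dy_fp16_960-yololayer.engine", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    auto* runtime = nvinfer1::createInferRuntime(gLogger);
    auto* engine  = runtime->deserializeCudaEngine(blob.data(), blob.size());
    auto* context = engine->createExecutionContext();

    // With a dynamic engine the batch dimension is -1 until it is pinned here.
    const int batchSize   = 8;
    const int inputIndex  = engine->getBindingIndex("images");
    const int outputIndex = engine->getBindingIndex("output0");
    context->setBindingDimensions(inputIndex, nvinfer1::Dims4{batchSize, 3, 960, 960});
    if (!context->allInputDimensionsSpecified()) return 1;

    // Device buffers sized for the chosen batch (output size assumes 1000 x 57 + 1 floats per image).
    void* bindings[2]{};
    cudaMalloc(&bindings[inputIndex],  size_t(batchSize) * 3 * 960 * 960 * sizeof(float));
    cudaMalloc(&bindings[outputIndex], size_t(batchSize) * (1000 * 57 + 1) * sizeof(float));
    // ... copy the preprocessed batch into bindings[inputIndex] ...

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    context->enqueueV2(bindings, stream, nullptr);  // enqueueV2, not the implicit-batch enqueue()
    cudaStreamSynchronize(stream);
    // ... copy bindings[outputIndex] back and post-process ...
    return 0;
}

The build-side part of the fix (the plugin itself) is discussed further down in the thread.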
I'm having the same problem with dynamic batch conversion (same error message). Has anyone solved it yet?
Here is my procedure:
- pt > onnx
python3 export.py --weights yolov7-w6-pose.pt --img-size 960 --fp16 --dynamic-batch --device=0
- insert yololayer
python3 yolov7-pose/YoloLayer_TRT_v7.0/script/add_custom_yolo_op.py
with the following change so the output gets a dynamic batch dimension:
dim_outputs = n_maxoutput * n_outputs + 1  # (1000 x 57 + 1)
if dynamic_batch:
    out_shape = (gs.Tensor.DYNAMIC, dim_outputs, 1, 1)
- onnx > trt engine
/usr/src/tensorrt/bin/trtexec --verbose --onnx=yolov7-w6-pose-dynamic_960-yololayer.onnx --fp16 \
--minShapes=images:1x3x960x960 \
--optShapes=images:12x3x960x960 \
--maxShapes=images:16x3x960x960 \
--saveEngine=yolov7_w6_pose_dy_fp16_960-yololayer.engine \
--plugins=yolov7-pose/YoloLayer_TRT_v7.0/build/libyolo.so
Rewriting yololayer.cu so that the plugin inherits from the IPluginV2DynamicExt class instead of the IPluginV2Ext class is one way to support dynamic batch.
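For anyone attempting that rewrite, here is a rough skeleton of what the dynamic-ext version of the plugin can look like. This is a sketch assuming the TensorRT 8.x plugin API, not the repo's actual yololayer.cu; the class name, the member fields, and the missing decode-kernel launch are placeholders. The two parts that matter for dynamic batch are getOutputDimensions(), which keeps the batch dimension symbolic via IExprBuilder, and enqueue(), which reads the real batch size from the input descriptor at runtime.

// yololayer_dynamic.cu - rough sketch only, not the repo's actual plugin.
// Assumes the TensorRT 8.x plugin API; class name, fields and the decode-kernel
// launch are placeholders that would come from the existing yololayer.cu.
#include <NvInfer.h>
#include <cuda_runtime.h>
#include <string>

using namespace nvinfer1;

class YoloLayerDynamic : public IPluginV2DynamicExt
{
public:
    YoloLayerDynamic(int maxOutputs, int outputSize)
        : mMaxOutputs(maxOutputs), mOutputSize(outputSize) {}

    // ---- IPluginV2DynamicExt: the parts that actually enable dynamic batch ----

    // Keep the batch dimension symbolic: output = [N, maxOutputs * outputSize + 1, 1, 1].
    DimsExprs getOutputDimensions(int, const DimsExprs* inputs, int,
                                  IExprBuilder& exprBuilder) noexcept override
    {
        DimsExprs out;
        out.nbDims = 4;
        out.d[0] = inputs[0].d[0];  // runtime batch, taken from the first input
        out.d[1] = exprBuilder.constant(mMaxOutputs * mOutputSize + 1);
        out.d[2] = exprBuilder.constant(1);
        out.d[3] = exprBuilder.constant(1);
        return out;
    }

    int enqueue(const PluginTensorDesc* inputDesc, const PluginTensorDesc*,
                const void* const* inputs, void* const* outputs,
                void*, cudaStream_t stream) noexcept override
    {
        // The real batch size is only known here, from the input descriptor.
        const int batchSize = inputDesc[0].dims.d[0];
        // TODO: launch the existing YOLO decode kernel for each image in the
        // batch (or a batched version of it), writing into outputs[0].
        (void)inputs; (void)outputs; (void)stream; (void)batchSize;
        return 0;
    }

    IPluginV2DynamicExt* clone() const noexcept override
    {
        return new YoloLayerDynamic(mMaxOutputs, mOutputSize);
    }

    bool supportsFormatCombination(int pos, const PluginTensorDesc* inOut,
                                   int, int) noexcept override
    {
        return inOut[pos].format == TensorFormat::kLINEAR
            && inOut[pos].type == DataType::kFLOAT;
    }

    void configurePlugin(const DynamicPluginTensorDesc*, int,
                         const DynamicPluginTensorDesc*, int) noexcept override {}

    size_t getWorkspaceSize(const PluginTensorDesc*, int,
                            const PluginTensorDesc*, int) const noexcept override
    { return 0; }

    // ---- IPluginV2Ext / IPluginV2 boilerplate ----
    DataType getOutputDataType(int, const DataType*, int) const noexcept override
    { return DataType::kFLOAT; }
    const char* getPluginType() const noexcept override { return "YoloLayer_TRT"; }
    const char* getPluginVersion() const noexcept override { return "1"; }
    int getNbOutputs() const noexcept override { return 1; }
    int initialize() noexcept override { return 0; }
    void terminate() noexcept override {}
    size_t getSerializationSize() const noexcept override { return 2 * sizeof(int); }
    void serialize(void* buffer) const noexcept override
    {
        int* p = static_cast<int*>(buffer);
        p[0] = mMaxOutputs;
        p[1] = mOutputSize;
    }
    void destroy() noexcept override { delete this; }
    void setPluginNamespace(const char* ns) noexcept override { mNamespace = ns; }
    const char* getPluginNamespace() const noexcept override { return mNamespace.c_str(); }

private:
    int mMaxOutputs;        // e.g. 1000 detections
    int mOutputSize;        // e.g. 57 floats per detection
    std::string mNamespace;
};

The matching IPluginCreator and the deserialization constructor would still need to be carried over from the existing plugin, and libyolo.so rebuilt before rerunning the trtexec command above.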
But I ended up using yolov8-pose (which supports dynamic batch processing) and integrated it into the DeepStream Python API. Details are in deepstream-yolo-pose.
References:
- 2021/02, "PluginV2Layer must be V2DynamicExt when there are runtime input dimensions": "I have solved this issue by inheriting the IPluginV2DynamicExt class instead of the IPluginV2Ext class."
- 实现TensorRT自定义插件(plugin)自由 (Implementing custom TensorRT plugins freely)
- 9. Custom Layers in TensorRT, NVIDIA technical blog (Chinese edition)
- https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_plugin_v2.html