yolov7-pose
YoloLayer CUDA file for batch processing with V2DynamicExt
Hi @nanmi, I want to run the TensorRT model with batch size 8, but when I infer with a batch of 8 I only get results for the first image and the remaining outputs are zero, because this model only supports batch 1. So I exported an ONNX model with a dynamic batch size successfully, but when I convert it to TensorRT I get the error below:
ERROR: [TRT]: YoloLayer_TRT_0: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
ERROR: [TRT]: ModelImporter.cpp:726: While parsing node number 495 [YoloLayer_TRT -> "output0"]:
ERROR: [TRT]: ModelImporter.cpp:727: --- Begin node ---
ERROR: [TRT]: ModelImporter.cpp:728: input: "745" input: "783" input: "821" input: "859" output: "output0" name: "YoloLayer_TRT_0" op_type: "YoloLayer_TRT"
ERROR: [TRT]: ModelImporter.cpp:729: --- End node ---
ERROR: [TRT]: ModelImporter.cpp:732: ERROR: ModelImporter.cpp:185 In function parseGraph: [6] Invalid Node - YoloLayer_TRT_0 YoloLayer_TRT_0: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
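For batched inference to return non-zero results for every image, two things have to hold: the engine must be built with a dynamic batch dimension, and the execution context has to be given the actual input shape at inference time. Below is a minimal sketch of that runtime step, assuming the TensorRT 8.x C++ API; the binding names "images"/"output0" and the engine/plugin file names come from the commands later in this thread, while the buffer sizes and everything else are placeholders, not the repo's code.

// infer_dynamic_batch.cpp - minimal sketch of running a dynamic-batch engine (TensorRT 8.x).
// Binding names "images"/"output0" come from this thread; everything else is a placeholder.
#include <NvInfer.h>
#include <cuda_runtime.h>
#include <dlfcn.h>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) noexcept override
    {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

int main()
{
    // The custom YoloLayer plugin must be loaded before deserializing the engine.
    dlopen("./libyolo.so", RTLD_NOW | RTLD_GLOBAL);

    std::ifstream file("yolov7_w6_pose_dy_fp16_960-yololayer.engine", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    auto* runtime = nvinfer1::createInferRuntime(gLogger);
    auto* engine  = runtime->deserializeCudaEngine(blob.data(), blob.size());
    auto* context = engine->createExecutionContext();

    // With a dynamic engine the batch dimension is -1 until it is pinned here.
    const int batchSize   = 8;
    const int inputIndex  = engine->getBindingIndex("images");
    const int outputIndex = engine->getBindingIndex("output0");
    context->setBindingDimensions(inputIndex, nvinfer1::Dims4{batchSize, 3, 960, 960});
    if (!context->allInputDimensionsSpecified()) return 1;

    // Device buffers sized for the chosen batch (output size assumes 1000 x 57 + 1 floats per image).
    void* bindings[2]{};
    cudaMalloc(&bindings[inputIndex],  size_t(batchSize) * 3 * 960 * 960 * sizeof(float));
    cudaMalloc(&bindings[outputIndex], size_t(batchSize) * (1000 * 57 + 1) * sizeof(float));
    // ... copy the preprocessed batch into bindings[inputIndex] ...

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    context->enqueueV2(bindings, stream, nullptr);  // enqueueV2, not the implicit-batch enqueue()
    cudaStreamSynchronize(stream);
    // ... copy bindings[outputIndex] back and post-process ...
    return 0;
}

The build-side part of the fix (the plugin itself) is discussed further down in the thread.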
I'm having the same problem with dynamic batch conversion (same error message). Has anyone solved it yet?
Here is my procedure:
- pt > onnx
python3 export.py --weights yolov7-w6-pose.pt --img-size 960 --fp16 --dynamic-batch --device=0
- insert yololayer
python3 yolov7-pose/YoloLayer_TRT_v7.0/script/add_custom_yolo_op.py
with the following change so the output gets a dynamic batch dimension:
dim_outputs = n_maxoutput * n_outputs + 1  # (1000 x 57 + 1)
if dynamic_batch:
    out_shape = (gs.Tensor.DYNAMIC, dim_outputs, 1, 1)
- onnx > trt engine
/usr/src/tensorrt/bin/trtexec --verbose --onnx=yolov7-w6-pose-dynamic_960-yololayer.onnx --fp16 \
--minShapes=images:1x3x960x960 \
--optShapes=images:12x3x960x960 \
--maxShapes=images:16x3x960x960 \
--saveEngine=yolov7_w6_pose_dy_fp16_960-yololayer.engine \
--plugins=yolov7-pose/YoloLayer_TRT_v7.0/build/libyolo.so
Rewriting yololayer.cu so that the plugin inherits from the IPluginV2DynamicExt class instead of the IPluginV2Ext class is one way to support dynamic batch.
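For anyone attempting that rewrite, here is a rough skeleton of what the dynamic-ext version of the plugin can look like. This is a sketch assuming the TensorRT 8.x plugin API, not the repo's actual yololayer.cu; the class name, the member fields, and the missing decode-kernel launch are placeholders. The two parts that matter for dynamic batch are getOutputDimensions(), which keeps the batch dimension symbolic via IExprBuilder, and enqueue(), which reads the real batch size from the input descriptor at runtime.

// yololayer_dynamic.cu - rough sketch only, not the repo's actual plugin.
// Assumes the TensorRT 8.x plugin API; class name, fields and the decode-kernel
// launch are placeholders that would come from the existing yololayer.cu.
#include <NvInfer.h>
#include <cuda_runtime.h>
#include <string>

using namespace nvinfer1;

class YoloLayerDynamic : public IPluginV2DynamicExt
{
public:
    YoloLayerDynamic(int maxOutputs, int outputSize)
        : mMaxOutputs(maxOutputs), mOutputSize(outputSize) {}

    // ---- IPluginV2DynamicExt: the parts that actually enable dynamic batch ----

    // Keep the batch dimension symbolic: output = [N, maxOutputs * outputSize + 1, 1, 1].
    DimsExprs getOutputDimensions(int, const DimsExprs* inputs, int,
                                  IExprBuilder& exprBuilder) noexcept override
    {
        DimsExprs out;
        out.nbDims = 4;
        out.d[0] = inputs[0].d[0];  // runtime batch, taken from the first input
        out.d[1] = exprBuilder.constant(mMaxOutputs * mOutputSize + 1);
        out.d[2] = exprBuilder.constant(1);
        out.d[3] = exprBuilder.constant(1);
        return out;
    }

    int enqueue(const PluginTensorDesc* inputDesc, const PluginTensorDesc*,
                const void* const* inputs, void* const* outputs,
                void*, cudaStream_t stream) noexcept override
    {
        // The real batch size is only known here, from the input descriptor.
        const int batchSize = inputDesc[0].dims.d[0];
        // TODO: launch the existing YOLO decode kernel for each image in the
        // batch (or a batched version of it), writing into outputs[0].
        (void)inputs; (void)outputs; (void)stream; (void)batchSize;
        return 0;
    }

    IPluginV2DynamicExt* clone() const noexcept override
    {
        return new YoloLayerDynamic(mMaxOutputs, mOutputSize);
    }

    bool supportsFormatCombination(int pos, const PluginTensorDesc* inOut,
                                   int, int) noexcept override
    {
        return inOut[pos].format == TensorFormat::kLINEAR
            && inOut[pos].type == DataType::kFLOAT;
    }

    void configurePlugin(const DynamicPluginTensorDesc*, int,
                         const DynamicPluginTensorDesc*, int) noexcept override {}

    size_t getWorkspaceSize(const PluginTensorDesc*, int,
                            const PluginTensorDesc*, int) const noexcept override
    { return 0; }

    // ---- IPluginV2Ext / IPluginV2 boilerplate ----
    DataType getOutputDataType(int, const DataType*, int) const noexcept override
    { return DataType::kFLOAT; }
    const char* getPluginType() const noexcept override { return "YoloLayer_TRT"; }
    const char* getPluginVersion() const noexcept override { return "1"; }
    int getNbOutputs() const noexcept override { return 1; }
    int initialize() noexcept override { return 0; }
    void terminate() noexcept override {}
    size_t getSerializationSize() const noexcept override { return 2 * sizeof(int); }
    void serialize(void* buffer) const noexcept override
    {
        int* p = static_cast<int*>(buffer);
        p[0] = mMaxOutputs;
        p[1] = mOutputSize;
    }
    void destroy() noexcept override { delete this; }
    void setPluginNamespace(const char* ns) noexcept override { mNamespace = ns; }
    const char* getPluginNamespace() const noexcept override { return mNamespace.c_str(); }

private:
    int mMaxOutputs;        // e.g. 1000 detections
    int mOutputSize;        // e.g. 57 floats per detection
    std::string mNamespace;
};

The matching IPluginCreator and the deserialization constructor would still need to be carried over from the existing plugin, and libyolo.so rebuilt before rerunning the trtexec command above.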
But I ended up using yolov8-pose (which supports dynamic batch processing) and integrated it into the DeepStream Python API. Details are in deepstream-yolo-pose.
References:
- 2021/02, "PluginV2Layer must be V2DynamicExt when there are runtime input dimensions": "I have solved this issue by inheriting the IPluginV2DynamicExt class instead of the IPluginV2Ext class."
- 实现TensorRT自定义插件(plugin)自由 (Implementing custom TensorRT plugins freely)
- 9. Custom Layers in TensorRT, NVIDIA technical blog (Chinese edition)
- https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_plugin_v2.html