
[Backend] Enable TensorRT BatchedNMSDynamic_TRT plugin

Open jiangjiajun opened this issue 2 years ago • 1 comments

PR types

TensorRT backend

Describe

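  • Remove the existing workaround ("trick") logic for deploying PaddleDetection models and use the TensorRT BatchedNMSDynamic_TRT plugin instead (EfficientNMS_TRT cannot match the results of all PaddleDetection detection models)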
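For context on the step this PR hands to the TensorRT plugin: non-maximum suppression (NMS) keeps the highest-scoring box and drops lower-scoring boxes that overlap it too much. The sketch below is a generic single-class greedy NMS in plain Python, for illustration only. It is not the implementation of BatchedNMSDynamic_TRT or of FastDeploy's post-processing, and the names `iou` and `greedy_nms` and the threshold defaults are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5, score_threshold=0.0):
    """Return indices of kept boxes, highest score first (hypothetical helper)."""
    # Drop low-score candidates, then visit the rest in descending score order.
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.68, so it is suppressed and `kept` contains indices 0 and 2. A batched, multi-class NMS applies the same idea per image and per class.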

jiangjiajun, Oct 28 '22 02:10

Impact of the new PR on accuracy and performance

| Model | Backend | Accuracy | Runtime time per sample | End-to-end time per sample |
|---|---|---|---|---|
| PP-YOLOE-L | PP-TensorRT FP32 | 51.4% | 56.91ms | 64.29ms |
| PP-YOLOE-L | PP-TensorRT FP16 | | 46.76ms | 52.69ms |
| PP-YOLOE-L(v0.4) | TensorRT FP32 | 51.4% | 10.99ms | 67.36ms |
| PP-YOLOE-L(v0.4) | TensorRT FP16 | | 5.44ms | 44.14ms |
| PP-YOLOE-L(dev) | TensorRT FP32 | 51.4% | 14.23ms | 18.46ms |
| PP-YOLOE-L(dev) | TensorRT FP16 | | 7.83ms | 12.11ms |
| YOLOv3-Dark53 | PP-TensorRT FP32 | | 14.41ms | 19.08ms |
| YOLOv3-Dark53 | PP-TensorRT FP16 | | 10.89ms | 16.43ms |
| YOLOv3-Dark53(0.4) | TensorRT FP32 | | 10.99ms | 19.22ms |
| YOLOv3-Dark53(0.4) | TensorRT FP16 | | 6.30ms | 13.52ms |
| YOLOv3-Dark53(dev) | TensorRT FP32 | | 10.7831ms | 15.20ms |
| YOLOv3-Dark53(dev) | TensorRT FP16 | | 5.03ms | 9.38ms |

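Note: in version 0.4, when running on TensorRT, the NMS in detection models was automatically split out of the model and moved into post-processing. As a result, version 0.4 can show a faster Runtime time but a slower total end-to-end time.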
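To make the note concrete, the time spent outside the Runtime can be estimated as end-to-end time minus Runtime time. The sketch below does that arithmetic for the TensorRT rows of the table above. The dictionary layout and the names `TIMINGS`, `overhead_ms`, and `OVERHEAD` are illustrative assumptions, and the "(0.4)" rows are labelled "v0.4" for consistency. The difference covers everything outside the Runtime (pre-processing as well as post-processing), so it is only a rough indicator of the NMS cost.

```python
# (model, precision, version) -> (runtime_ms, end_to_end_ms), copied from the table.
TIMINGS = {
    ("PP-YOLOE-L", "FP32", "v0.4"): (10.99, 67.36),
    ("PP-YOLOE-L", "FP32", "dev"): (14.23, 18.46),
    ("PP-YOLOE-L", "FP16", "v0.4"): (5.44, 44.14),
    ("PP-YOLOE-L", "FP16", "dev"): (7.83, 12.11),
    ("YOLOv3-Dark53", "FP32", "v0.4"): (10.99, 19.22),
    ("YOLOv3-Dark53", "FP32", "dev"): (10.7831, 15.20),
    ("YOLOv3-Dark53", "FP16", "v0.4"): (6.30, 13.52),
    ("YOLOv3-Dark53", "FP16", "dev"): (5.03, 9.38),
}


def overhead_ms(runtime_ms, end_to_end_ms):
    """Time per sample spent outside the Runtime, rounded to 2 decimals."""
    return round(end_to_end_ms - runtime_ms, 2)


OVERHEAD = {key: overhead_ms(*times) for key, times in TIMINGS.items()}

if __name__ == "__main__":
    for (model, precision, version), ms in OVERHEAD.items():
        print(f"{model} {precision} {version}: {ms:.2f} ms outside Runtime")
```

For PP-YOLOE-L FP32 this gives about 56.37 ms outside the Runtime in v0.4 against about 4.23 ms in dev, which matches the note: v0.4 has the shorter Runtime time but the longer end-to-end time.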

jiangjiajun, Nov 01 '22 06:11