GELU Plugin increases the inference time!
Description
When I use the GELU plugin in my project, it increases the inference time. Before adding the GELU plugin, inference time was 44 ms (fp32); after adding it, inference time is 102 ms (fp32). I'm very confused by this phenomenon.
Environment
TensorRT Version: 8.4.0.6
NVIDIA GPU: Tesla T4
NVIDIA Driver Version: 10.2
CUDA Version: 10.2
CUDNN Version: 8.3.2
Operating System: CentOS
Baremetal or Container (if so, version): No
Steps To Reproduce
(1) Use onnx-graphsurgeon to merge the LayerNorm and GELU subgraphs into plugin nodes (see the sketch after this list).
(2) Use trtexec to generate the TRT engine:
trtexec --onnx=./myonnx.onnx --saveEngine=./myengine.trt --plugins=./libgelu.so --plugins=./liblaynorm.so --verbose
(3) Use C++ to run inference with the engine (a Python sketch of the same flow also follows).
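Step (1) is not shown in the original report; below is a minimal onnx-graphsurgeon sketch of what such a merge typically looks like, assuming the graph contains the usual Erf-based GELU subgraph (Div -> Erf -> Add -> Mul -> Mul). The plugin op name ("CustomGeluPluginDynamic") and the position-based pattern matching are illustrative and must match your actual plugin creator and graph:

```python
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("./myonnx.onnx"))

# Find each Erf node; assume the GELU pattern Div -> Erf -> Add -> Mul -> Mul
# (the exact ordering of the 0.5 multiply may differ in your export).
for erf in [n for n in graph.nodes if n.op == "Erf"]:
    div = erf.i(0)        # x / sqrt(2)
    add = erf.o(0)        # erf(...) + 1
    mul1 = add.o(0)       # x * (erf(...) + 1)
    mul2 = mul1.o(0)      # ... * 0.5
    x, y = div.inputs[0], mul2.outputs[0]

    # Insert one plugin node in place of the whole pattern; op and any
    # attributes must match the plugin creator registered by libgelu.so.
    graph.layer(op="CustomGeluPluginDynamic", name=erf.name + "_plugin",
                inputs=[x], outputs=[y])

    # Detach the replaced nodes so cleanup() removes them.
    for n in (div, erf, add, mul1, mul2):
        n.outputs.clear()

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "./myonnx_plugin.onnx")
```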
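Step (3) uses C++ in the original; as a sketch, here is the same flow with the TensorRT Python API, assuming static input shapes. The key detail is the same in both languages: the plugin .so files must be loaded before the engine is deserialized.

```python
import ctypes
import tensorrt as trt
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda

# Plugin creators must be registered before deserializing the engine.
ctypes.CDLL("./libgelu.so")
ctypes.CDLL("./liblaynorm.so")

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")

with open("./myengine.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# One pinned host buffer and one device buffer per binding (static shapes).
buffers = []
for i in range(engine.num_bindings):
    shape = engine.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(trt.volume(shape), dtype)
    buffers.append((host, cuda.mem_alloc(host.nbytes)))
bindings = [int(dev) for _, dev in buffers]

# Fill the input host buffers with real data here, then run one inference.
stream = cuda.Stream()
for host, dev in buffers:
    cuda.memcpy_htod_async(dev, host, stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
for host, dev in buffers:
    cuda.memcpy_dtoh_async(host, dev, stream)
stream.synchronize()
```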
Results:
(1) only layernorm: 44 ms (fp32) / 20 ms (fp16)
(2) layernorm + gelu: 102 ms (fp32) / 86 ms (fp16)
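One way to narrow this down (not shown above): trtexec can report per-layer timings for the engine, which shows whether the plugin kernel itself or broken fusions around it account for the difference (flags as in TensorRT 8.x trtexec; paths match the command above):
trtexec --loadEngine=./myengine.trt --plugins=./libgelu.so --plugins=./liblaynorm.so --dumpProfile --separateProfileRun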
@nvpohanh @kevinch-nv Is this the best practice?
It is not recommended to use the GELU plugin. TensorRT treats a plugin as an opaque layer, so it cannot fuse it with neighboring operations; the native, fused GELU pattern is usually faster than a plugin.
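As a sketch of the recommended alternative: keep GELU expressed as standard ONNX ops so TensorRT can fuse the Erf-based pattern into pointwise kernels instead of calling an opaque plugin. The helper below (names illustrative, assuming onnx-graphsurgeon) builds GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2))):

```python
import numpy as np
import onnx_graphsurgeon as gs

def native_gelu(graph: gs.Graph, x):
    """Append GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2))) as plain ONNX ops."""
    sqrt2 = np.array(np.sqrt(2.0), dtype=np.float32)
    one = np.array(1.0, dtype=np.float32)
    half = np.array(0.5, dtype=np.float32)
    t = graph.layer(op="Div", inputs=[x, sqrt2], outputs=["gelu_div"])[0]
    t = graph.layer(op="Erf", inputs=[t], outputs=["gelu_erf"])[0]
    t = graph.layer(op="Add", inputs=[t, one], outputs=["gelu_add"])[0]
    t = graph.layer(op="Mul", inputs=[x, t], outputs=["gelu_mul"])[0]
    return graph.layer(op="Mul", inputs=[t, half], outputs=["gelu_out"])[0]
```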
Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!