
GELU Plugin increases the inference time!

oreo-lp opened this issue 3 years ago

Description

When I use the GELU plugin in my project, it increases the inference time: without the plugin, inference takes 44 ms (FP32); with the plugin, it takes 102 ms (FP32). I'm very confused by this behavior.

Environment

TensorRT Version: 8.4.0.6
NVIDIA GPU: Tesla T4
NVIDIA Driver Version: 10.2
CUDA Version: 10.2
CUDNN Version: 8.3.2
Operating System: CentOS
Baremetal or Container (if so, version): No

Steps To Reproduce

(1) Use onnx-graphsurgeon to merge the LayerNorm and GELU subgraphs into single plugin nodes (a hedged sketch of the GELU step follows this list).
(2) Use trtexec to build the TensorRT engine:
trtexec --onnx=./myonnx.onnx --saveEngine=./myengine.trt --plugins=./libgelu.so --plugins=./liblaynorm.so --verbose
(3) Run inference from C++.
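The issue does not include the graph-surgery script, so the following is only a minimal sketch of step (1) for the GELU part. It assumes the exporter decomposed GELU as x * (erf(x / sqrt(2)) + 1) * 0.5; the traversal hops, the plugin op name "CustomGeluPluginDynamic", and the file names are illustrative and would need to match the actual model and plugin registration.

```python
# Hedged sketch: collapse an Erf-based GELU subgraph into one plugin node
# with onnx-graphsurgeon. Names and pattern layout are assumptions, not
# taken from the original issue.
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("myonnx.onnx"))

for node in list(graph.nodes):
    # The Erf node sits at the centre of the decomposition:
    #   y = x * (erf(x / sqrt(2)) + 1) * 0.5
    if node.op != "Erf":
        continue

    div = node.i()                 # Div producing x / sqrt(2)
    gelu_in = div.inputs[0]        # original activation input x
    add = node.o()                 # Add(erf(...), 1)        -- assumed layout
    mul1 = add.o()                 # Mul(x, add)             -- assumed layout
    mul2 = mul1.o()                # Mul(..., 0.5)           -- assumed layout
    gelu_out = mul2.outputs[0]

    # Insert one plugin node that produces the same output tensor, then
    # detach the old final Mul so cleanup() removes the dead subgraph.
    graph.layer(op="CustomGeluPluginDynamic",
                name=node.name + "_gelu_plugin",
                inputs=[gelu_in], outputs=[gelu_out])
    mul2.outputs.clear()

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "myonnx_gelu_plugin.onnx")
```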

Results:

(1) LayerNorm plugin only: 44 ms (FP32) / 20 ms (FP16)
(2) LayerNorm + GELU plugins: 102 ms (FP32) / 86 ms (FP16)

oreo-lp commented Aug 25 '22 03:08

@nvpohanh @kevinch-nv Is this the best practice?

zerollzeng commented Aug 25 '22 15:08

It is not recommended to use the GeLU plugin.
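The comment does not spell out the alternative, but recent TensorRT builders can typically recognize and fuse the Erf-based GELU pattern from standard ONNX ops on their own, so one option worth trying is to leave GELU unmodified instead of swapping in a plugin. Below is a minimal sketch assuming a PyTorch source model; the module, tensor shapes, and file names are illustrative, not taken from the issue.

```python
# Hedged sketch: export GELU as standard ONNX ops and let TensorRT fuse it,
# rather than replacing it with a custom plugin.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.fc = nn.Linear(dim, dim)
        self.act = nn.GELU()   # exported as the standard Erf-based decomposition

    def forward(self, x):
        return self.act(self.fc(self.norm(x)))

model = Block().eval()
dummy = torch.randn(1, 128, 768)
torch.onnx.export(model, dummy, "block_plain_gelu.onnx", opset_version=13)

# Build without the GELU plugin, e.g.:
#   trtexec --onnx=block_plain_gelu.onnx --saveEngine=block.trt --fp16 --verbose
```

Comparing this build against the plugin-based one with trtexec's timing output would show whether the plugin is what blocks the faster fused kernels.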

nvpohanh commented Dec 02 '22 09:12

Closing since there has been no activity for more than 3 weeks; please reopen if you still have questions, thanks!

ttyio commented Jan 10 '23 02:01