
❓ [Question] Why is the TensorRT model slower?

Open geekinglcq opened this issue 2 years ago • 4 comments

❓ Question

Why is the TensorRT model slower? I tried TensorRT on an MHA (multi-head attention) model, but found it is even slower than the JIT-scripted model.

What you have already tried

I tested the original model, the JIT-scripted model, the JIT model after optimize_for_inference, and the TensorRT model, and found that the TensorRT model is not as fast as I expected. The model here is a simple MHA module, modified from fairseq so that it passes compilation.

import time
import tmp_attn
import torch
import tensorrt
import torch_tensorrt as torch_trt


def timer(m, i):
    # Call the module 10000 times with the same tensor as query, key and value,
    # and return the total wall-clock time.
    st = time.time()
    for _ in range(10000):
        m(i, i, i)
    ed = time.time()
    return ed - st


# (64, 1, 1280) input on the GPU: (seq_len, batch, embed_dim) in the fairseq MHA convention.
t1 = torch.randn(64, 1, 1280, device="cuda:0")
model = tmp_attn.MultiheadAttention(1280, 8).to("cuda:0")
model2 = torch.jit.script(model)
model3 = torch.jit.optimize_for_inference(model2)
# Compile to a TensorRT-backed TorchScript module.
model4 = torch_trt.compile(model, inputs=[t1, t1, t1]).to("cuda:0")

print("Original Model", timer(model, t1))
print("Jit Script Model", timer(model2, t1))
print("Jit Script Model after optimization", timer(model3, t1))
print("TensorRT Model", timer(model4, t1))

I ran each model 10000 times and recorded the elapsed time. The output is:

Original Model 5.6981117725372314
Jit Script Model 4.5694739818573
Jit Script Model after optimization 3.3332810401916504
TensorRT Model 4.772718667984009
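
One caveat worth noting about these numbers (it is not raised in the thread): CUDA kernels are launched asynchronously, so a plain time.time() loop measures Python and kernel-launch overhead as much as GPU execution unless the device is synchronized, and with a single (64, 1, 1280) input the per-call overhead can easily dominate. Below is a minimal sketch of a synchronized, warmed-up version of the timer, reusing the imports and the model/input objects from the script above:

def timer_synced(m, i, n_iters=10000, warmup=100):
    # Warm-up so one-time costs (CUDA context setup, cuDNN autotuning,
    # TensorRT engine initialization) are excluded from the measurement.
    for _ in range(warmup):
        m(i, i, i)
    torch.cuda.synchronize()
    st = time.time()
    for _ in range(n_iters):
        m(i, i, i)
    # Wait for all queued GPU work to finish before stopping the clock.
    torch.cuda.synchronize()
    return time.time() - st

Synchronizing and warming up mainly changes how much of the measurement is per-call overhead versus GPU time, which matters when comparing backends whose launch overheads differ.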

Environment

Build information about Torch-TensorRT can be found by turning on debug messages
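
For reference, a small sketch of how those debug messages can be turned on from the Python API (assuming the torch_tensorrt.logging helpers available in the 1.x releases):

import torch_tensorrt

# Raise the log level so compile() prints build, conversion and
# graph-partitioning details, including which ops fall back to Torch.
torch_tensorrt.logging.set_reportable_log_level(torch_tensorrt.logging.Level.Debug)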

  • PyTorch Version (e.g., 1.0): 1.11.0
  • CPU Architecture: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
  • OS (e.g., Linux): Linux, CentOS7
  • How you installed PyTorch (conda, pip, libtorch, source): conda
  • Build command you used (if compiling from source): /
  • Are you using local sources or building from archives: No
  • Python version: 3.7
  • CUDA version: 11.7
  • GPU models and configuration:
  • TensorRT version: 8.2.5.1
  • Torch_tensorrt version: 1.1.0

Additional context

The code of the MHA module is attached: tmp_attn.py.zip (contains tmp_attn.py).

geekinglcq avatar Jun 20 '22 06:06 geekinglcq

More information: I tested TensorRT on an EncoderLayer module; basically, it is the attention module above plus some FC (fully-connected) layers, layer norm, and dropout layers. The results (table below) show that TensorRT achieves about a 2x speedup there.

                                 MHA                 EncoderLayer
                                 time (s)   rel.     time (s)   rel.
Original Module                  4.46       1        10.77      1
Jit Scripted Module              4.42       0.99      9.612     0.89
Jit Module with Optimization     2.9        0.65      5.775     0.53
TensorRT                         4.34       0.97      4.875     0.45
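
For context, here is a minimal sketch of the kind of encoder layer described above (multi-head attention plus fully-connected layers, layer norm, and dropout) and how it would be compiled the same way as the MHA script; the module definition, names, and sizes are illustrative assumptions, not the author's tmp_attn code:

import torch
import torch.nn as nn
import torch_tensorrt as torch_trt


class EncoderLayer(nn.Module):
    def __init__(self, embed_dim=1280, num_heads=8, ffn_dim=5120, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads)
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.fc1 = nn.Linear(embed_dim, ffn_dim)
        self.fc2 = nn.Linear(ffn_dim, embed_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # Self-attention block with residual connection and layer norm.
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + self.dropout(attn_out))
        # Feed-forward block with residual connection and layer norm.
        ffn_out = self.fc2(self.dropout(torch.relu(self.fc1(x))))
        return self.norm2(x + ffn_out)


x = torch.randn(64, 1, 1280, device="cuda:0")
layer = EncoderLayer().to("cuda:0").eval()   # eval() so dropout is a no-op at inference
trt_layer = torch_trt.compile(layer, inputs=[x])

Which ops are actually converted to TensorRT and which fall back to Torch depends on the converter support in the installed torch_tensorrt version, which in turn affects how much of the ~2x speedup shows up.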

geekinglcq avatar Jun 22 '22 08:06 geekinglcq

I'm getting this output performance using your script:

Original Model 3.2848336696624756
Jit Script Model 2.7592527866363525
Jit Script Model after optimization 2.0758402347564697
TensorRT Model 1.4786508083343506

bowang007 avatar Jul 01 '22 19:07 bowang007

I'm getting this output performance using your script:

Original Model 3.2848336696624756
Jit Script Model 2.7592527866363525
Jit Script Model after optimization 2.0758402347564697
TensorRT Model 1.4786508083343506

Is there any significant difference between your environment and mine?

geekinglcq avatar Jul 02 '22 03:07 geekinglcq

Hi @geekinglcq, here are my env details:

PyTorch Version (e.g., 1.0): 1.11.0
CPU Architecture: Intel(R) Core(TM) i9-7920X CPU @ 2.90GHz
OS (e.g., Linux): Linux, Ubuntu 20.04
How you installed PyTorch (conda, pip, libtorch, source): pip
Build command you used (if compiling from source): /
Are you using local sources or building from archives: yes
Python version: 3.8
CUDA version: 11.3
GPU models and configuration:
TensorRT version: 8.2.4.2
Torch_tensorrt version: 1.1.0

bowang007 avatar Jul 19 '22 22:07 bowang007

This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.

github-actions[bot] avatar Oct 18 '22 00:10 github-actions[bot]

@geekinglcq have you found anything? I am hitting the same problem, though with a different model and a different TensorRT version.

Liujingxiu23 avatar Nov 09 '23 09:11 Liujingxiu23