🐛 [Bug] Encountered bug when using Torch-TensorRT(torch.nn.TransformerEncoder)

Open johnzlli opened this issue 1 year ago • 2 comments

Bug Description

I encountered the following error when using Torch-TensorRT to convert torch.nn.TransformerEncoder in the Docker image nvcr.io/nvidia/pytorch:23.12-py3: [image: error screenshot]

To Reproduce

Example code:

import torch
import torch_tensorrt
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.trans = nn.TransformerEncoder(nn.TransformerEncoderLayer(
            nhead=8, d_model=2048, dim_feedforward=2048, dropout=0,), num_layers=3)

    def forward(self, x):
        x = self.trans(x)
        return x

model = Model().half().eval().cuda()
inputs = [torch.randn(100, 200, 2048).half().cuda()]
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
trt_gm(*inputs)

Expected behavior

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version: 2.2.0a0
  • PyTorch Version: 2.2.0a0+81ea7a4
  • CPU Architecture:
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version: 3.10.12
  • CUDA version: 12.3
  • GPU models and configuration: A100
  • Any other relevant information:

Additional context

johnzlli, Jan 16 '24 09:01

@narendasan Hi, is there any update on this issue?

johnzlli, Apr 03 '24 04:04

This should be supported on main, provided you enable FP16 kernels by passing enabled_precisions to torch_tensorrt.compile:

import torch
import torch_tensorrt
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.trans = nn.TransformerEncoder(nn.TransformerEncoderLayer(
            nhead=8, d_model=2048, dim_feedforward=2048, dropout=0,), num_layers=3)

    def forward(self, x):
        x = self.trans(x)
        return x

model = Model().half().eval().cuda()
inputs = [torch.randn(100, 200, 2048).half().cuda()]
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs, enabled_precisions={torch.float, torch.half})
trt_gm(*inputs)

narendasan, Jun 11 '24 19:06
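
The suggested fix differs from the original reproduction only in the enabled_precisions argument. As a minimal sketch, assuming you would rather derive that set from the model than hard-code it, the helper below collects the floating-point dtypes of the model's parameters. The name precisions_for is hypothetical and is not part of Torch-TensorRT; the small Linear layer stands in for the much larger TransformerEncoder so the snippet runs without a GPU.

```python
import torch
import torch.nn as nn


def precisions_for(model: nn.Module) -> set:
    """Hypothetical helper: gather the floating-point dtypes of a model's parameters.

    torch.float is always added, mirroring the {torch.float, torch.half}
    set used in the thread.
    """
    found = {p.dtype for p in model.parameters() if p.dtype.is_floating_point}
    return found | {torch.float}


# Stand-in for the thread's model: any module converted with .half()
# reports torch.half (an alias of torch.float16) for its parameters.
tiny = nn.Linear(4, 4).half()
precisions = precisions_for(tiny)
print(precisions)

# Assumed usage, following the call shown in the thread (needs a CUDA GPU):
# trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs,
#                                 enabled_precisions=precisions_for(model))
```

For a model converted with .half(), the resulting set contains both torch.float and torch.half, which matches the value passed explicitly in the reply above.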