🐛 [Bug] Encountered bug when using Torch-TensorRT(torch.nn.TransformerEncoder)

Open johnzlli opened this issue 1 year ago • 2 comments

Bug Description

I encountered an error when using Torch-TensorRT to convert torch.nn.TransformerEncoder in the Docker image nvcr.io/nvidia/pytorch:23.12-py3.

To Reproduce

Example code:

import torch
import torch_tensorrt
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.trans = nn.TransformerEncoder(nn.TransformerEncoderLayer(
            nhead=8, d_model=2048, dim_feedforward=2048, dropout=0,), num_layers=3)

    def forward(self, x):
        x = self.trans(x)
        return x

model = Model().half().eval().cuda()
inputs = [torch.randn(100, 200, 2048).half().cuda()]
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
trt_gm(*inputs)

Expected behavior

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

Torch-TensorRT Version: 2.2.0a0
PyTorch Version: 2.2.0a0+81ea7a4
CPU Architecture:
OS (e.g., Linux): Linux
How you installed PyTorch (conda, pip, libtorch, source):
Build command you used (if compiling from source):
Are you using local sources or building from archives:
Python version: 3.10.12
CUDA version: 12.3
GPU models and configuration: A100
Any other relevant information:

Additional context

Jan 16 '24 09:01 johnzlli

@narendasan Hi, is there any update on this issue?

Apr 03 '24 04:04 johnzlli

This should be supported on main, provided you enable FP16 kernels by passing enabled_precisions to torch_tensorrt.compile:

import torch
import torch_tensorrt
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.trans = nn.TransformerEncoder(nn.TransformerEncoderLayer(
            nhead=8, d_model=2048, dim_feedforward=2048, dropout=0,), num_layers=3)

    def forward(self, x):
        x = self.trans(x)
        return x

model = Model().half().eval().cuda()
inputs = [torch.randn(100, 200, 2048).half().cuda()]
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs, enabled_precisions={torch.float,torch.half})
trt_gm(*inputs)
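
Once compilation succeeds, it can be worth checking that the FP16 TensorRT module roughly agrees with eager PyTorch. The following is only a minimal sketch: `outputs_match` and `check_compiled` are hypothetical helper names, not part of Torch-TensorRT, and the tolerances are assumed starting points for half precision rather than recommended values.

```python
import torch


def outputs_match(reference, candidate, rtol=1e-2, atol=1e-2):
    """Compare two tensors in float32 on the CPU.

    Returns (ok, max_abs_diff). The default tolerances are assumptions
    chosen loosely for FP16; tune them for your model.
    """
    ref = reference.detach().float().cpu()
    cand = candidate.detach().float().cpu()
    max_abs_diff = (ref - cand).abs().max().item()
    ok = torch.allclose(ref, cand, rtol=rtol, atol=atol)
    return ok, max_abs_diff


def check_compiled(model, compiled, inputs):
    """Run the eager and compiled modules on the same inputs and compare.

    Hypothetical helper: assumes both modules return a single tensor.
    """
    with torch.no_grad():
        expected = model(*inputs)
        actual = compiled(*inputs)
    return outputs_match(expected, actual)
```

With the snippet above, this could be called as `ok, diff = check_compiled(model, trt_gm, inputs)`. A large `diff` would suggest a precision or conversion problem rather than a plain compile failure.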

Jun 11 '24 19:06 narendasan
