🐛 [Bug] Encountered bug when using Torch-TensorRT (torch.nn.TransformerEncoder)
Bug Description
Encountered an error when using Torch-TensorRT to convert torch.nn.TransformerEncoder in the docker image nvcr.io/nvidia/pytorch:23.12-py3.
To Reproduce
Example code:

import torch
import torch.nn as nn
import torch_tensorrt

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        # Three-layer Transformer encoder with dropout disabled
        self.trans = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(nhead=8, d_model=2048,
                                       dim_feedforward=2048, dropout=0),
            num_layers=3)

    def forward(self, x):
        x = self.trans(x)
        return x

model = Model().half().eval().cuda()
inputs = [torch.randn(100, 200, 2048).half().cuda()]
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
trt_gm(*inputs)
Expected behavior
Environment
Build information about Torch-TensorRT can be found by turning on debug messages (see the sketch after this list).
- Torch-TensorRT Version: 2.2.0a0
- PyTorch Version: 2.2.0a0+81ea7a4
- CPU Architecture:
- OS (e.g., Linux): Linux
- How you installed PyTorch (conda, pip, libtorch, source):
- Build command you used (if compiling from source):
- Are you using local sources or building from archives:
- Python version: 3.10.12
- CUDA version: 12.3
- GPU models and configuration: A100
- Any other relevant information:
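
A minimal sketch for turning on those debug messages, assuming the dynamo frontend emits through the standard Python "torch_tensorrt" logger namespace (reuses model and inputs from the reproduction above):

import logging
logging.basicConfig()
# Assumption: Torch-TensorRT's dynamo frontend logs under the
# "torch_tensorrt" logger; raising it to DEBUG prints build information.
logging.getLogger("torch_tensorrt").setLevel(logging.DEBUG)
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)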
Additional context
@narendasan Hi, is there any update for this issue?
This should be supported on main, provided you enable FP16 kernels:
import torch
import torch.nn as nn
import torch_tensorrt

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.trans = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(nhead=8, d_model=2048,
                                       dim_feedforward=2048, dropout=0),
            num_layers=3)

    def forward(self, x):
        x = self.trans(x)
        return x

model = Model().half().eval().cuda()
inputs = [torch.randn(100, 200, 2048).half().cuda()]
# enabled_precisions lets TensorRT select FP16 kernels
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs,
                                enabled_precisions={torch.float, torch.half})
trt_gm(*inputs)
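
As a quick sanity check (a minimal sketch; the atol/rtol tolerances are illustrative assumptions for FP16, not values from this thread), you can compare the compiled module against the eager model:

with torch.no_grad():
    ref = model(*inputs)   # eager FP16 reference
    out = trt_gm(*inputs)  # Torch-TensorRT output
# Some Torch-TensorRT versions return a tuple/list of outputs
if isinstance(out, (tuple, list)):
    out = out[0]
print(torch.allclose(ref, out, atol=1e-2, rtol=1e-2))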