Inconsistent inference results between the PyTorch model and the converted TensorRT model when using the linear operator
Description:
I'm experiencing a discrepancy between the inference results of a PyTorch model and those of the TensorRT model obtained by converting it with the torch2trt tool.
Reproduce
This issue can be reproduced by the following script:
import torch
from torch.nn import Module
from torch2trt import torch2trt

# Random input and linear-layer parameters: input [128, 20], weight [400, 20], bias [400].
para_0 = torch.randn([128, 20], dtype=torch.float32).cuda()   # input
para_1 = torch.randn([400, 20], dtype=torch.float32).cuda()   # weight
para_2 = torch.randn([400], dtype=torch.float32).cuda()       # bias

class linear(Module):
    def forward(self, *args):
        return torch.nn.functional.linear(args[0], para_1, para_2)

model = linear().float().eval().cuda()
model_trt = torch2trt(model, [para_0])

output = model(para_0)
output_trt = model_trt(para_0)

print("output:\n", output)
print("output_trt:\n", output_trt)
print(torch.max(torch.abs(output - output_trt)))
The output is:
output:
tensor([[ -1.6021, -2.1279, -1.8634, ..., -5.0204, -2.6875, 4.2837],
[ -4.6518, -0.0451, -4.4829, ..., 2.4758, -0.8829, 1.4029],
[ -4.1329, -3.9315, -3.4823, ..., 1.7019, 4.5339, -0.5770],
...,
[ -5.8334, -3.6527, -10.3388, ..., 4.0736, 4.8801, 2.3656],
[ -9.5916, -2.6035, 3.5427, ..., 2.6379, 1.1783, -2.3509],
[ 14.1478, 0.0149, 5.9481, ..., 4.8245, 1.2463, 1.0344]],
device='cuda:0')
output_trt:
tensor([[ -1.6010, -2.1274, -1.8624, ..., -5.0214, -2.6872, 4.2846],
[ -4.6523, -0.0443, -4.4821, ..., 2.4744, -0.8844, 1.4006],
[ -4.1328, -3.9333, -3.4799, ..., 1.7038, 4.5343, -0.5795],
...,
[ -5.8338, -3.6521, -10.3373, ..., 4.0745, 4.8798, 2.3658],
[ -9.5902, -2.6025, 3.5415, ..., 2.6374, 1.1800, -2.3479],
[ 14.1483, 0.0171, 5.9471, ..., 4.8244, 1.2461, 1.0341]],
device='cuda:0')
tensor(0.0086, device='cuda:0')
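The maximum absolute difference is about 0.0086 on outputs whose magnitudes reach roughly 14. As a minimal sketch (not part of the original report), the comparison could also be expressed with a relative tolerance, which helps tell a genuine conversion bug apart from ordinary floating-point drift (for example, TF32 matrix multiplies that TensorRT may enable by default on Ampere GPUs); it assumes the `output` and `output_trt` tensors from the script above:

# Relative-error check, assuming `output` and `output_trt` from the script above.
abs_err = torch.max(torch.abs(output - output_trt))
rel_err = torch.max(torch.abs(output - output_trt) / torch.abs(output).clamp(min=1e-6))
print("max abs err:", abs_err.item())
print("max rel err:", rel_err.item())

# True if the two outputs agree within a loose FP32/TF32-level tolerance.
print(torch.allclose(output, output_trt, rtol=1e-2, atol=1e-3))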
Environment
- torch: 2.1.1
- torch2trt: 0.4.0
- tensorrt: 8.6.1