torch2trt
Indexing and shape incoherent with module conversion
Minimum Reproducible Example
import torch
import torch2trt

USE_TRT = True


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(32, 4, kernel_size=1)

    def forward(self, inp):
        out = self.conv(inp)
        return out


def main():
    if USE_TRT:
        x = torch.ones((1, 32, 16, 16)).cuda().half()
        model = Model().half().cuda().eval()
        torch2trt.torch2trt(
            model, [x], max_batch_size=1, input_names=["input"], output_names=["output"], fp16_mode=True
        )

    # Test masking.
    mask = torch.tensor([True, False], device="cuda")
    data = torch.zeros([2, 5], device="cuda").half()
    masked_data = data[mask]
    try:
        # The following assert fails with TRT, works without TRT.
        assert list(masked_data.shape) == [1, 5]
    except:
        print("========Mask Error=========")
        print("masked_data.shape =", list(masked_data.shape))
        print("ExpectedShape = [1, 5]")

    # Test indexing.
    idx = torch.tensor([1, 2], device="cuda")
    data = torch.zeros(10, device="cuda")
    try:
        # The following indexing fails with TRT, works without TRT.
        data[idx]
    except Exception as e:
        print("========Indexing Error=========")
        print(e)


if __name__ == "__main__":
    main()
With USE_TRT = True, this prints the following:
========Mask Error=========
masked_data.shape = [0, 2, 5]
ExpectedShape = [1, 5]
========Indexing Error=========
too many indices for tensor of dimension 1
Environment
PyTorch version: 1.4.0
CUDA used to build PyTorch: 10.1
OS: Ubuntu 18.04.4 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.10.2
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: Quadro P2000
Nvidia driver version: 430.50
Hi,
It looks like once we call import torch2trt; torch2trt.torch2trt(...), torch.Tensor.__getitem__ is overridden by the torch2trt getitem method from that point on. As a result, plain torch code that relies on torch.Tensor.__getitem__, such as data[mask], gains an extra dimension because it now goes through the torch2trt getitem method: https://github.com/NVIDIA-AI-IOT/torch2trt/blob/master/torch2trt/converters/getitem.py#L32.
One workaround for mask indexing, e.g. data[torch.tensor([False, True], device="cuda")], is to always add one extra True entry to the mask (data[torch.tensor([True, False, True], device="cuda")]) and then take the first element of the result (data = data[mask][0]).
Another workaround is to compute data[mask] before the torch2trt call, so that torch.Tensor.__getitem__ has not yet been overridden by torch2trt when it is used.
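The override mechanism described above can be illustrated without torch or torch2trt. The following is a minimal sketch in plain Python, assuming a hypothetical Box container class and a hypothetical patched_getitem context manager (neither exists in torch or torch2trt). It only shows how replacing __getitem__ on a class changes indexing for every instance while the patch is in place, and that a clean restore makes the effect disappear again.

```python
from contextlib import contextmanager


class Box:
    """Hypothetical container standing in for a tensor type."""

    def __init__(self, items):
        self.items = list(items)

    def __getitem__(self, index):
        return self.items[index]


@contextmanager
def patched_getitem(cls, calls):
    """Temporarily wrap cls.__getitem__ and record every index used."""
    original = cls.__getitem__

    def wrapper(self, index):
        calls.append(index)  # side effect seen by every caller of obj[index]
        return original(self, index)

    cls.__getitem__ = wrapper
    try:
        yield
    finally:
        cls.__getitem__ = original  # put the original method back


calls = []
box = Box([10, 20, 30])

with patched_getitem(Box, calls):
    inside = box[1]   # goes through the wrapper

outside = box[2]      # patch removed, wrapper no longer runs
print(inside, outside, calls)
```

In this sketch the class defines __getitem__ itself, so saving and reassigning the method is a faithful restore. The behavior reported in this thread is what it looks like when such a patch stays effective after the point where it was supposed to be undone.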
@salexspb I think this is the bug we are seeing
We ran into the same issue. I added debug prints in __enter__ and __exit__ of ConversionHook. It seems like torch.Tensor.__getitem__ was correctly set back to the original method. However, we are still seeing the behavior of the torch2trt getitem.
Debug print code:
def __enter__(self):
    try:
        self.method_impl = eval(self.method_str)
    except AttributeError:
        self.method_impl = None

    if self.method_impl:
        wrapped = attach_converter(self.ctx, self.method_impl, self.converter, self.method_str)
        print("__enter__(): setting {} to {}, it was {}".format(self.method_str, wrapped, eval(self.method_str)))
        self._set_method(wrapped)

def __exit__(self, type, val, tb):
    if self.method_impl:
        print("__exit__(): setting {} to {}".format(self.method_str, self.method_impl))
        self._set_method(self.method_impl)
Printout for __getitem__ only:
__enter__(): setting torch.Tensor.__getitem__ to <function attach_converter.<locals>.wrapper at 0x7f86ec4c7b70>, it was <slot wrapper '__getitem__' of 'torch._C._TensorBase' objects>
__exit__(): setting torch.Tensor.__getitem__ to <slot wrapper '__getitem__' of 'torch._C._TensorBase' objects>
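The printout shows the saved attribute as a slot wrapper of torch._C._TensorBase, which suggests that torch.Tensor inherits __getitem__ rather than defining it. Whether that is related to the behavior seen here is not established in this thread. As a sketch under that assumption, the plain-Python example below (hypothetical Base, Child, wrapper, and restore names, no torch involved) shows that assigning a saved inherited method back onto a subclass does not return the class to its original state, while deleting the shadowing attribute does.

```python
class Base:
    def __getitem__(self, index):
        return ("base", index)


class Child(Base):
    pass  # __getitem__ is inherited, not defined here


def wrapper(self, index):
    return ("patched", index)


def restore(cls, name, saved, was_own):
    """Undo a patch: reassign only if cls defined the attribute itself."""
    if was_own:
        setattr(cls, name, saved)
    else:
        delattr(cls, name)  # fall back to the inherited attribute


# Record the state before patching.
was_own = "__getitem__" in vars(Child)   # inherited, so not in Child's own dict
saved = Child.__getitem__

# Patch, then "restore" by plain assignment.
Child.__getitem__ = wrapper
Child.__getitem__ = saved
leftover = "__getitem__" in vars(Child)  # Child now shadows Base's attribute

# Patch again, then restore by deleting the shadowing attribute instead.
Child.__getitem__ = wrapper
restore(Child, "__getitem__", saved, was_own)
clean = "__getitem__" not in vars(Child)

print(was_own, leftover, clean, Child()[0])
```

For pure Python classes both restore styles index the same way afterwards; the difference is only in the class dictionary. For extension types the consequences may differ, which is an open question here rather than a confirmed diagnosis.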
My repro:
import torch
import torch2trt
import tensorrt as trt
import traceback
# works
tensor = torch.rand((7,7))
tensor[tensor != tensor]
print("tensor indexing before conversionCtx passed")
logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network()
conversionCtx = torch2trt.ConversionContext(network)
with conversionCtx as cCtx:
    pass
# Fails
tensor = torch.rand((7,7))
tensor[tensor != tensor]
I also added import pdb; pdb.set_trace() in https://github.com/NVIDIA-AI-IOT/torch2trt/blob/master/torch2trt/converters/getitem.py#L32, which doesn't seem to stop the program at all.
I have the same issue here. Did you solve this problem?
This issue should be addressed by #738.