
Indexing and shape incoherent with module conversion


Minimal Reproducible Example

import torch
import torch2trt

USE_TRT = True


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(32, 4, kernel_size=1)

    def forward(self, inp):
        out = self.conv(inp)
        return out


def main():
    if USE_TRT:
        x = torch.ones((1, 32, 16, 16)).cuda().half()
        model = Model().half().cuda().eval()
        torch2trt.torch2trt(
            model, [x], max_batch_size=1, input_names=["input"], output_names=["output"], fp16_mode=True
        )

    # Test masking.
    mask = torch.tensor([True, False], device="cuda")
    data = torch.zeros([2, 5], device="cuda").half()
    masked_data = data[mask]
    try:
        # The following assert fails with TRT, works without TRT.
        assert list(masked_data.shape) == [1, 5]
    except AssertionError:
        print("========Mask Error=========")
        print("masked_data.shape =", list(masked_data.shape))
        print("ExpectedShape = [1, 5]")

    # Test indexing.
    idx = torch.tensor([1, 2], device="cuda")
    data = torch.zeros(10, device="cuda")
    try:
        # The following indexing fails with TRT, works without TRT.
        data[idx]
    except Exception as e:
        print("========Indexing Error=========")
        print(e)


if __name__ == "__main__":
    main()

Running this produces the following output:

========Mask Error=========
masked_data.shape = [0, 2, 5]
ExpectedShape = [1, 5]
========Indexing Error=========
too many indices for tensor of dimension 1

Environment

PyTorch version: 1.4.0
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.4 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.10.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
  GPU 0: GeForce RTX 2080 Ti
  GPU 1: Quadro P2000

Nvidia driver version: 430.50

meetps · Mar 18 '20 13:03

Hi,

It looks like once we call import torch2trt; torch2trt.torch2trt(...), torch.Tensor.__getitem__ is overridden by the torch2trt getitem converter at that point. As a result, plain torch.Tensor operations that go through torch.Tensor.__getitem__, such as data[mask], gain an extra dimension because of the torch2trt getitem converter: https://github.com/NVIDIA-AI-IOT/torch2trt/blob/master/torch2trt/converters/getitem.py#L32.
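Roughly, the hook mechanism amounts to the following simplified sketch (not torch2trt's actual code, just an illustration of the override-and-restore pattern):

import torch

# Save the stock PyTorch method before patching.
_original_getitem = torch.Tensor.__getitem__

def _patched_getitem(tensor, index):
    # A real converter wrapper would also record this slice as TensorRT layers,
    # which is where the unexpected extra dimension comes from.
    return _original_getitem(tensor, index)

torch.Tensor.__getitem__ = _patched_getitem   # installed while converting
# ... conversion runs; every data[mask] now goes through the wrapper ...
torch.Tensor.__getitem__ = _original_getitem  # restored afterwards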

One workaround, when indexing a tensor with a boolean mask such as data[torch.tensor([False, True], device="cuda")], is to always add one extra True entry to the mask (e.g. data[torch.tensor([True, False, True], device="cuda")]) and then take the first element of the result (data = data[mask][0]).

Another workaround is to compute data[mask] before the torch2trt call, so that torch.Tensor.__getitem__ has not yet been overridden by torch2trt at the time it runs.
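For illustration, a minimal sketch of this second workaround, reusing the Model class and shapes from the repro above (the exact values are just for the example):

import torch
import torch2trt

# Do the masking while torch.Tensor.__getitem__ is still the stock PyTorch method.
mask = torch.tensor([True, False], device="cuda")
data = torch.zeros([2, 5], device="cuda").half()
masked_data = data[mask]
assert list(masked_data.shape) == [1, 5]  # passes, nothing has been patched yet

# Only afterwards run the conversion; __getitem__ calls made after this point
# may still show the behavior described above.
x = torch.ones((1, 32, 16, 16)).cuda().half()
model = Model().half().cuda().eval()  # Model as defined in the repro above
model_trt = torch2trt.torch2trt(model, [x], fp16_mode=True)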

yutec-nvidia · Apr 28 '20 22:04

@salexspb I think this is the bug we are seeing

We ran into the same issue. I added debug prints in __enter__ and __exit__ of ConversionHook. It looks like torch.Tensor.__getitem__ is correctly restored to the original method on exit. However, we are still seeing the torch2trt getitem behavior afterwards.

printout code:

    def __enter__(self):
        try:
            self.method_impl = eval(self.method_str)
        except AttributeError:
            self.method_impl = None

        if self.method_impl:
            wrapped = attach_converter(self.ctx, self.method_impl, self.converter, self.method_str)
            print("__enter__(): setting {} to {}, it was {}".format(self.method_str, wrapped, eval(self.method_str)))
            self._set_method(wrapped)

    def __exit__(self, type, val, tb):
        if self.method_impl:
            print("__exit__(): setting {} to {}".format(self.method_str, self.method_impl))
            self._set_method(self.method_impl)

printout __getitem__ only:

__enter__(): setting torch.Tensor.__getitem__ to <function attach_converter.<locals>.wrapper at 0x7f86ec4c7b70>, it was <slot wrapper '__getitem__' of 'torch._C._TensorBase' objects>

__exit__(): setting torch.Tensor.__getitem__ to <slot wrapper '__getitem__' of 'torch._C._TensorBase' objects>

My repro:

import torch
import torch2trt
import tensorrt as trt
import traceback

# works
tensor = torch.rand((7,7))
tensor[tensor != tensor]

print("tensor indexing before conversionCtx passed")

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network()

conversionCtx = torch2trt.ConversionContext(network)
with conversionCtx as cCtx:
    pass

# Fails
tensor = torch.rand((7,7))
tensor[tensor != tensor]

I also added import pdb; pdb.set_trace() in https://github.com/NVIDIA-AI-IOT/torch2trt/blob/master/torch2trt/converters/getitem.py#L32 which doesn't seem to stop the program at all.
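As an additional sanity check (just a debugging suggestion along the same lines as the printouts above), one can compare the method object before and after the context exits:

import torch
import torch2trt
import tensorrt as trt

original = torch.Tensor.__getitem__
print("before:", original)

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network()

# Enter and immediately exit the conversion context, as in the repro above.
with torch2trt.ConversionContext(network):
    print("inside:", torch.Tensor.__getitem__)

print("after:", torch.Tensor.__getitem__)
print("restored:", torch.Tensor.__getitem__ is original)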

chebbyChefNEQ · May 10 '20 07:05

I have the same issue here. Did you solve this problem?

eriche2016 · Dec 25 '20 12:12

This issue should be addressed by #738.

chaoz-dev · May 20 '22 19:05