Are FastRCNN models from TorchVision supported in TRTorch?

Open saipj opened this issue 3 years ago • 9 comments

❓ Question

I tried the Faster R-CNN and Mask R-CNN models from TorchVision. Compilation fails with the error "RuntimeError: tuple appears in op that does not forward tuples, unsupported kind: aten::append".

What you have already tried

Code to reproduce:

import torch
print(torch.__version__)
import trtorch
print(trtorch.__version__)
import torchvision

fastrcnn = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model = fastrcnn.eval().to("cuda")
scripted_model = torch.jit.script(model)
compile_settings = {
    "input_shapes": [[3, 300, 400], [3, 300, 400]],
    "op_precision": torch.float,
}
trt_model = trtorch.compile(scripted_model, compile_settings)

Environment

Build information about the TRTorch compiler can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0): 1.8.1
  • CPU Architecture:
  • OS (e.g., Linux): Ubuntu
  • How you installed PyTorch (conda, pip, libtorch, source): pip
  • Build command you used (if compiling from source): N/A
  • Are you using local sources or building from archives: N/A
  • Python version: python3.7
  • CUDA version: 11.1
  • GPU models and configuration:
  • Any other relevant information: TRTorch - 0.3.0

Additional context

saipj avatar Aug 06 '21 01:08 saipj

Can you share some more information? Is the error coming from the torch.jit.script() call or from trtorch.compile? If it's a TRTorch error, can you enable debug logging by adding trtorch.logging.set_reportable_log_level(trtorch.logging.Level.Debug) to your Python script and run it again? This can help identify which layer the error is coming from.

peri044 avatar Aug 06 '21 08:08 peri044

The error "RuntimeError: tuple appears in op that does not forward tuples, unsupported kind: aten::append" comes from trtorch.compile; torch.jit.script() works fine. I changed the script to enable debug logging as suggested, but didn't get any additional information.

saipj avatar Aug 06 '21 15:08 saipj

I'm running into the exact same problem. torch.jit.script works just fine, but trtorch.compile throws an exception.

Here's a Dockerfile to reproduce the issue:

FROM pytorch/pytorch:1.8.1-cuda11.1-cudnn8-devel

# Have to download and copy over the TensorRT .deb package
COPY nv-tensorrt-repo-ubuntu1804-cuda11.1-trt7.2.3.4-ga-20210226_1-1_amd64.deb ./
RUN dpkg -i nv-tensorrt-repo-ubuntu1804-cuda11.1-trt7.2.3.4-ga-20210226_1-1_amd64.deb

RUN apt update && apt install -y libnvinfer-dev=7.2.3-1+cuda11.1 \
    libnvinfer-plugin-dev=7.2.3-1+cuda11.1 \
    libnvparsers-dev=7.2.3-1+cuda11.1 \
    libnvonnxparsers-dev=7.2.3-1+cuda11.1 \
    libnvinfer-samples=7.2.3-1+cuda11.1 \
    tensorrt

RUN pip install https://github.com/NVIDIA/TRTorch/releases/download/v0.3.0/trtorch-0.3.0-cp38-cp38-linux_x86_64.whl

Running the following script gives the same error as mentioned earlier:

import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
import trtorch

model = maskrcnn_resnet50_fpn()
scripted = torch.jit.script(model)
trt = trtorch.compile(scripted, {"input_shapes": [(1, 3, 512, 512)]})

Output:

RuntimeError                              Traceback (most recent call last)
<ipython-input-1-bf759ff71e00> in <module>
      5 model = maskrcnn_resnet50_fpn()
      6 scripted = torch.jit.script(model)
----> 7 trt = trtorch.compile(scripted, {"input_shapes": [(1, 3, 512, 512)]})

/opt/conda/lib/python3.8/site-packages/trtorch/_compiler.py in compile(module, compile_spec)
     71             "torch.jit.ScriptFunction currently is not directly supported, wrap the function in a module to compile")
     72 
---> 73     compiled_cpp_mod = trtorch._C.compile_graph(module._c, _parse_compile_spec(compile_spec))
     74     compiled_module = torch.jit._recursive.wrap_cpp_module(compiled_cpp_mod)
     75     return compiled_module

RuntimeError: tuple appears in op that does not forward tuples, unsupported kind: aten::append

fkodom avatar Aug 22 '21 12:08 fkodom

This issue has not seen activity for 90 days, Remove stale label or comment or this will be closed in 10 days

github-actions[bot] avatar Nov 21 '21 00:11 github-actions[bot]

Same Problem.

Please tell me how to resolve the error!!

Error converting detectron2 to TensorRT

I'm converting the Detectron2 Faster R-CNN FPN model to TensorRT in two steps:

(1) Convert the Detectron2 model to TorchScript and save it to a file.
(2) Load the TorchScript from the saved file and compile it to TensorRT.

The following error occurs during the compilation in step (2).

Environment

  • OS: Ubuntu 18.04
  • CUDA: 11.3
  • torch: 1.10.1+cu113
  • torchvision: 0.11.2+cu113
  • torch-tensorrt: 1.0.0

Error

[W lower_tuples.cpp:209] Warning: tuple appears in op inputs, but this op does not forward tuples, unsupported kind: aten::append (function flattenInputs)
[W lower_tuples.cpp:209] Warning: tuple appears in op inputs, but this op does not forward tuples, unsupported kind: prim::SetAttr (function flattenInputs)
[W lower_tuples.cpp:209] Warning: tuple appears in op inputs, but this op does not forward tuples, unsupported kind: prim::SetAttr (function flattenInputs)
[W lower_tuples.cpp:248] Warning: tuple appears in the op outputs, but this op does not forward tuples, unsupported kind: prim::GetAttr (function flattenOutputs)
Traceback (most recent call last):
  File "tools/compile_torchscript.py", line 61, in <module>
    main()
  File "tools/compile_torchscript.py", line 56, in main
    trt_ts_module = torch_tensorrt.compile(script_model,
  File "/home/ubuntu/detectron2_v0.5_aws/venv/lib/python3.8/site-packages/torch_tensorrt/_compile.py", line 97, in compile
    return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
  File "/home/ubuntu/detectron2_v0.5_aws/venv/lib/python3.8/site-packages/torch_tensorrt/ts/_compiler.py", line 119, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: tuple use not matched to tuple construct. Instead found: aten::append

Code

    data_loader = build_detection_test_loader(cfg, cfg.DATASETS.TEST[0])
    first_batch = next(iter(data_loader))
    inputs = [{"image": first_batch[0]["image"]}]
    with torch.no_grad():
        fields = {
            "proposal_boxes": Boxes,
            "objectness_logits": Tensor,
            "pred_boxes": Boxes,
            "scores": Tensor,
            "pred_classes": Tensor,
            "pred_masks": Tensor,
        }
        model = torch.jit.load("model_final.pt")
        script_model = scripting_with_instances(model, fields)
        print(script_model)
        start = perf_counter()
        scripted_instance = \
        script_model.inference(inputs, do_postprocess=False)[0]
        end = perf_counter()
        print(end - start)
        compile_settings = {
            "inputs": [torch_tensorrt.Input(
                min_shape=[1, 3, 224, 224],
                opt_shape=[1, 3, 512, 512],
                max_shape=[1, 3, 1024, 1024],
                dtype=torch.half
                # Datatype of input tensor. Allowed options torch.(float|half|int8|int32|bool)
            )],
            "enabled_precisions": {torch.half},  # Run with FP16
        }
        trt_ts_module = torch_tensorrt.compile(script_model,
                                               **compile_settings)


yoshikakoba avatar Jan 13 '22 04:01 yoshikakoba

Same issue here. After debugging a little deeper, I found that the root cause is in the model code, which builds a list of tuples in Python:

        original_image_sizes: List[Tuple[int, int]] = []
        for img in images:
            val = img.shape[-2:]
            assert len(val) == 2
            original_image_sizes.append((val[0], val[1]))

The Python code above is converted to TorchScript like this:

      %10809 : (int, int) = prim::TupleConstruct(%10807, %10808)
      %282 : (int, int)[] = aten::append(%original_image_sizes.1, %10809) # /usr/local/lib/python3.6/dist-packages/torchvision/models/detection/generalized_rcnn.py:75:12

Note that %10809 is a tuple.

The flattenInputs function in the torch::jit::LowerAllTuples pass (torch/csrc/jit/passes/lower_tuples.cpp) has this check:

static void flattenInputs(Node* n, Node* insert_point) {
  // flatten the input list  op(a, tup, b) --> op(a, t0, t1, b)
...
} else {
        TORCH_WARN(
            "tuple appears in op inputs, but this op does not forward tuples, ",
            "unsupported kind: ",
            n->kind().toQualString());
        ++i;
      }
...
}

Note that this warning is expected, because the element type of the list is indeed a tuple. Since the aten::append op is not supported by torch::jit::LowerAllTuples, the tuple is not lowered.

So, in the end, the EnsureNoTuples check failed.
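The sequence described above can be sketched as a toy model in plain Python. This is not PyTorch's implementation: the Node class, the string-based type names, the "toy::passthrough" op kind, and the final error text are all assumptions made up for illustration. Only the warning wording and the aten::append op kind are taken from the logs in this thread.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical allow-list for this toy model only; the real pass keeps its own list.
FORWARDS_TUPLES = {"toy::passthrough"}


@dataclass
class Node:
    kind: str
    input_types: List[str]  # type names as strings, e.g. "int" or "(int, int)"


def is_tuple(type_str: str) -> bool:
    # "(int, int)" is a tuple; "(int, int)[]" is a list of tuples, not a tuple.
    return type_str.startswith("(") and type_str.endswith(")")


def flatten(type_str: str) -> List[str]:
    # "(int, int)" -> ["int", "int"]; handles flat (non-nested) tuples only.
    return [part.strip() for part in type_str[1:-1].split(",")]


def lower_tuples(nodes: List[Node]) -> List[str]:
    """Flatten tuple inputs where the op allows it; collect warnings otherwise."""
    warnings: List[str] = []
    for node in nodes:
        if not any(is_tuple(t) for t in node.input_types):
            continue
        if node.kind in FORWARDS_TUPLES:
            flat: List[str] = []
            for t in node.input_types:
                flat.extend(flatten(t) if is_tuple(t) else [t])
            node.input_types = flat
        else:
            # The tuple input is left in place, as in the excerpt above.
            warnings.append(
                "tuple appears in op inputs, but this op does not forward tuples, "
                f"unsupported kind: {node.kind}"
            )
    return warnings


def ensure_no_tuples(nodes: List[Node]) -> None:
    """Toy stand-in for the later check that rejects any remaining tuple."""
    for node in nodes:
        if any(is_tuple(t) for t in node.input_types):
            raise RuntimeError(f"tuple still present in inputs of {node.kind}")


graph = [
    Node("toy::passthrough", ["int", "(int, int)"]),
    Node("aten::append", ["(int, int)[]", "(int, int)"]),
]
warnings = lower_tuples(graph)
try:
    ensure_no_tuples(graph)
    check_passed = True
except RuntimeError as err:
    check_passed = False
    print(err)
```

In this sketch the first node is flattened, while the aten::append node only produces a warning and keeps its tuple input, so the final check raises.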

To me this is a bug in the PyTorch JIT lowering process: when the element type of a list is a tuple, aten::append should support tuple lowering. What do you think? @narendasan
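Until the lowering pass handles this case, one possible model-side workaround is to avoid appending tuples at all, for example by keeping heights and widths in two separate int lists. The sketch below is plain Python and only illustrates the idea: collect_image_sizes is a hypothetical helper, it takes shapes rather than tensors so that it runs without PyTorch, and it has not been verified against Torch-TensorRT. Downstream code that expects List[Tuple[int, int]] would also need to be adapted.

```python
from typing import List, Sequence, Tuple


def collect_image_sizes(
    shapes: Sequence[Sequence[int]],
) -> Tuple[List[int], List[int]]:
    """Record (height, width) per image as two int lists instead of a list of tuples."""
    heights: List[int] = []
    widths: List[int] = []
    for shape in shapes:
        val = shape[-2:]  # last two dims are (H, W), as in the original loop
        assert len(val) == 2
        heights.append(int(val[0]))  # each append takes an int, never a tuple
        widths.append(int(val[1]))
    return heights, widths


# Two CHW images with the shapes used in the first report of this thread.
heights, widths = collect_image_sizes([(3, 300, 400), (3, 300, 400)])
print(heights, widths)
```

The function still returns a tuple of two lists, but no tuple is ever passed to an append call, which is the pattern the warning points at.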

void-main avatar Mar 21 '22 07:03 void-main

Besides, for now Torch-TensorRT doesn't seem to support models that take Tensor[] as input. For example, the FasterRCNN model's graph signature is:

graph(%self : __torch__.torchvision.models.detection.faster_rcnn.___torch_mangle_47.FasterRCNN,
      %images.1 : Tensor[],
      %targets.44 : Dict(str, Tensor)[]?):

Torch-TensorRT, however, requires each input to be Union(torch_tensorrt.Input, torch.Tensor).

void-main avatar Mar 21 '22 10:03 void-main

@bowang007 has been working on these models so he might have some insight

narendasan avatar May 18 '22 20:05 narendasan

@bowang007 any news?

Charlyo avatar Apr 18 '23 15:04 Charlyo
