ONNX/TensorRT conversion failure for Mask-RCNN model
Description
I am getting shape issues when I try to convert a Detectron2 Mask-RCNN model to ONNX (and then to TensorRT), despite following the guide here.
Environment
TensorRT Version: from source
NVIDIA GPU: 1x Quadro RTX 6000
NVIDIA Driver Version: 450.51.05
CUDA Version: 11.6
Python Version (if applicable): 3.9.12
PyTorch Version (if applicable): 1.12.1+cu116
Baremetal or Container (if so, version): Baremetal
CPU Architecture: x86_64
OS (e.g., Linux): Ubuntu 18.04
Relevant Files
Steps To Reproduce
I followed the steps outlined in this README.md.
Console Log:
(base) adityamishrav5@sixian-ThinkStation-P520:~/Desktop/ir_camera$ python content/TensorRT/samples/python/detectron2/create_onnx.py \
> --exported_onnx model.onnx \
> --onnx converted.onnx \
> --det2_config detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
> --det2_weights detectron/model_final_f10217.pkl \
> --sample_image new_2.jpg
WARNING:root:Pytorch pre-release version 1.13.0a0+gitd2b8b8f - assuming intent to test it
WARNING:root:Pytorch pre-release version 1.13.0a0+gitd2b8b8f - assuming intent to test it
INFO:ModelHelper:ONNX graph loaded successfully
INFO:ModelHelper:Number of FPN output channels is 256
INFO:ModelHelper:Number of classes is 80
INFO:ModelHelper:First NMS max proposals is 1000
INFO:ModelHelper:First NMS iou threshold is 0.7
INFO:ModelHelper:First NMS score threshold is 0.01
INFO:ModelHelper:First ROIAlign type is ROIAlignV2
INFO:ModelHelper:First ROIAlign pooled size is 7
INFO:ModelHelper:First ROIAlign sampling ratio is 0
INFO:ModelHelper:Second NMS max proposals is 100
INFO:ModelHelper:Second NMS iou threshold is 0.5
INFO:ModelHelper:Second NMS score threshold is 0.05
INFO:ModelHelper:Second ROIAlign type is ROIAlignV2
INFO:ModelHelper:Second ROIAlign pooled size is 14
INFO:ModelHelper:Second ROIAlign sampling ratio is 0
INFO:ModelHelper:Individual mask output resolution is 28x28
INFO:ModelHelper:ONNX graph input shape: [1, 3, 1344, 1344] [NCHW format set]
INFO:ModelHelper:Found Sub node
INFO:ModelHelper:Found Div node
INFO:ModelHelper:Found Conv node
/home/adityamishrav5/Desktop/ir_camera/pytorch/torch/functional.py:482: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /home/adityamishrav5/Desktop/ir_camera/pytorch/aten/src/ATen/native/TensorShape.cpp:3071.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Traceback (most recent call last):
File "/home/adityamishrav5/Desktop/ir_camera/content/TensorRT/samples/python/detectron2/create_onnx.py", line 658, in <module>
main(args)
File "/home/adityamishrav5/Desktop/ir_camera/content/TensorRT/samples/python/detectron2/create_onnx.py", line 639, in main
det2_gs.process_graph(anchors, args.first_nms_threshold, args.second_nms_threshold)
File "/home/adityamishrav5/Desktop/ir_camera/content/TensorRT/samples/python/detectron2/create_onnx.py", line 625, in process_graph
p2, p3, p4, p5 = backbone()
File "/home/adityamishrav5/Desktop/ir_camera/content/TensorRT/samples/python/detectron2/create_onnx.py", line 437, in backbone
first_RN_H = first_resnear_input.outputs[0].shape[2]*2.0
AttributeError: 'NoneType' object has no attribute 'outputs'
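For context on this traceback, here is a minimal sketch of why it ends in an AttributeError (my own simplification assuming onnx-graphsurgeon, not the sample's exact code): create_onnx.py locates nodes by hard-coded input/output tensor names, and the lookup returns None when the exported graph uses different names.

import onnx_graphsurgeon as gs

def find_node_by_op_input_output_name(graph: gs.Graph, op, input_name, output_name):
    # Scan for a node of the given op whose first input/output tensors
    # carry exactly these hard-coded names.
    for node in graph.nodes:
        if node.op == op and node.inputs and node.outputs \
                and node.inputs[0].name == input_name \
                and node.outputs[0].name == output_name:
            return node
    # No match (e.g. the export renamed the tensors): None is returned here,
    # so the caller's `first_resnear_input.outputs[0]` access raises.
    return None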
@azhurkevich Can you help here ^ ^
@zerollzeng I am OOTO till 9/19. Will take a look later
@azhurkevich I have exactly the same issues. Looking forward to your solution!
I'm facing the same issue. https://github.com/facebookresearch/detectron2/blob/main/tools/deploy/export_model.py works and model.onnx is generated, but TensorRT/samples/python/detectron2/create_onnx.py does not. The error is the same as @amishra791's.
Running on a Jetson Nano. PyTorch: 1.12, Python: 3.8, TensorRT: 8.0.1.
@azhurkevich , can you help?
@azhurkevich Anything new on that matter?
@icklerly1 Not yet, I'll check it out today/tomorrow
@patil-506 @icklerly1 I wonder how you guys are able to successfully export the model. For the last couple of months I have been getting this issue:
ImportError: cannot import name 'Caffe2Tracer' from 'detectron2.export' (/opt/conda/lib/python3.8/site-packages/detectron2/export/__init__.py)
and it is still not fixed with the latest code; it led to this widely discussed issue (ignore the model, this is universal). I think I can fix it by building PyTorch from source with BUILD_CAFFE2=1. However, I doubt you've built it from source, so I am wondering how you are able to successfully export it as is?
@patil-506 Btw, I never guaranteed that the model will work on a Jetson. I think it will, I just haven't tested it myself. First, you should run create_onnx.py on your PC, not the Jetson, mostly because it will be extremely hard to satisfy the library requirements on an ARM device. You should start using the Jetson at the step where you build the TRT engine. Second, you are using an old TRT. I specifically mention that TRT must be >= 8.4.1. I recommend reading this issue; it is very similar in some sense.
@azhurkevich I wonder if I am doing something fundamentally wrong. I keep getting this error: ImportError: cannot import name 'STABLE_ONNX_OPSET_VERSION' from 'detectron2.export'
@solarflarefx It was changed in a recent PR by somebody. It is Detectron2-related, so I cannot help much. Looks like you've already got your answer.
@amishra791 @icklerly1 @patil-506 When it comes to my converter: when the pure ONNX export method was added (no Caffe2 tracing), it also changed the node naming in the NN graph. Caffe2 can still be used when exporting, but it requires recompiling PyTorch with the BUILD_CAFFE2=1 environment variable. Since the node naming has changed, the converter will not work as is, even though the graph is the same except for that detail. It's pretty easy to fix, but I think time will be best spent if I change the converter to cater to the pure ONNX export method, mostly because Caffe2 is dead and deprecated.
If you urgently want it to work with the latest Detectron2, you can download the previous version's converted graph right here, then visualize the downloaded graph and the one you can export now (with recompiled PyTorch) side by side. Apply the required node-naming changes and the converter will work as is. You have to find all instances of find_node_by_op_input_output_name in create_onnx.py and change the 2nd and 3rd arguments based on the new node's input and output names. For example, the first instance is here: it looks for a Conv node with an input named 487 and an output named 488. Here it is in the old graph; in the new graph it will be here, with a new input and output name. Take those new names, write them in place of the old ones, and that's it. Do it for every instance and it should work: just find, match, and replace the names in the code, as in the sketch below.
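A hedged sketch of one such edit (the "487"/"488" names come from the old graph as described above; NEW_INPUT and NEW_OUTPUT are placeholders you read off your own exported graph in Netron, and the self.graph. call site is my assumption, not a quote of the sample):

# Before (old Caffe2-traced graph):
first_resnear_input = self.graph.find_node_by_op_input_output_name(
    "Conv", "487", "488")

# After (pure ONNX export): same op, but the 2nd and 3rd arguments are
# replaced with the tensor names read off the newly exported graph.
first_resnear_input = self.graph.find_node_by_op_input_output_name(
    "Conv", "NEW_INPUT", "NEW_OUTPUT")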
For the pure-ONNX converter I will not give any ETA, because there are other priorities for now; it will take some time to implement.
@azhurkevich, thank you for your feedback. The model conversion to ONNX and TensorRT worked on the Jetson Nano, but the TRT model does not give good inference. I followed the steps you mentioned to create the ONNX file on a laptop and later built the ONNX => TRT engine on the Jetson Nano.
Now using the official release. Jetson Nano: JetPack 4.6.1, Python: 3.6.9, TensorRT: 8.2, PyTorch: 1.9.
Issue:
- During the ONNX => TRT conversion, there are a lot of warnings that the workspace is not sufficient and tactics are skipped (same issue with the workspace set to 4 GB or 8 GB).
- The TRT engine file does not produce good results: there are no detections for a lot of images, and in the few where it detected something, the identified object is wrong.
- Updating TensorRT to 8.4 is, I think, not straightforward unless NVIDIA provides an update in a JetPack release for the Jetson Nano.
@azhurkevich, can you help in this issue?
@patil-506 You are not following the requirements. The requirement is TensorRT >= 8.4.1. A quick search tells me JetPack 4.6.1 has TensorRT 8.2; as a result it doesn't work. The majority of problems come from not following the instructions, so please be careful. You'll need JetPack 5.0.2. Again, I never tested it myself on a Jetson device and never said it would easily work out of the box, so YMMV.
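As a quick sanity check before debugging anything else, the installed TensorRT version can be confirmed from Python (a trivial snippet, not part of the sample):

import tensorrt as trt

# The converter requires TensorRT >= 8.4.1; JetPack 4.6.1 ships 8.2.x.
print(trt.__version__)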
@azhurkevich are you planning to work on the pure ONNX converter in the near future?
@azhurkevich I am trying to create a TensorRT version of the Detectron2 Mask-RCNN model. I understand that your conversion script (create_onnx.py) will not work with the newest Detectron2, so I decided to try to convert the ONNX model you provided here:
You can download previous version converted graph right here
Is this model already converted with create_onnx.py? I tried to run trtexec on your model but I get this error:
[11/30/2022-13:47:01] [E] [TRT] parsers/onnx/ModelImporter.cpp:773: While parsing node number 0 [AliasWithName -> "309"]:
[11/30/2022-13:47:01] [E] [TRT] parsers/onnx/ModelImporter.cpp:774: --- Begin node ---
[11/30/2022-13:47:01] [E] [TRT] parsers/onnx/ModelImporter.cpp:775: input: "data"
output: "309"
op_type: "AliasWithName"
attribute {
name: "name"
s: "data"
type: STRING
}
attribute {
name: "is_backward"
i: 0
type: INT
}
domain: "org.pytorch._caffe2"
[11/30/2022-13:47:01] [E] [TRT] parsers/onnx/ModelImporter.cpp:776: --- End node ---
[11/30/2022-13:47:01] [E] [TRT] parsers/onnx/ModelImporter.cpp:778: ERROR: parsers/onnx/builtin_op_importers.cpp:4890 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[11/30/2022-13:47:01] [E] Failed to parse onnx file
[11/30/2022-13:47:01] [I] Finish parsing network model
[11/30/2022-13:47:01] [E] Parsing model failed
[11/30/2022-13:47:01] [E] Failed to create engine from model or file.
[11/30/2022-13:47:01] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8402] # trtexec --onnx=./1344x1344_model.onnx --saveEngine=test_engine.trt --useCudaGraph
The whole output file: log.txt
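For anyone diagnosing a similar "Plugin not found" failure: the failing node above lives in the org.pytorch._caffe2 domain, which TensorRT's ONNX parser can only resolve via a plugin. A small hedged snippet (the file name is a placeholder) to list the non-standard op domains in a model:

import onnx

model = onnx.load("1344x1344_model.onnx")  # placeholder path
# Collect ops outside the default ONNX domain; these need TRT plugins.
custom_ops = {(n.domain, n.op_type) for n in model.graph.node
              if n.domain not in ("", "ai.onnx")}
print(custom_ops)  # e.g. {('org.pytorch._caffe2', 'AliasWithName')}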
Can you help me resolve this problem? Can you upload a Mask-RCNN model that can be successfully converted to TensorRT, together with its converted (.trt) version?
I also have a question regarding your create_onnx.py script: why do you use find_op_by_name with Conv layers and take their outputs, instead of searching directly for the Resize layers?
@antoszy I already updated the Detectron2 converter in October to support pure ONNX tracing, no Caffe2, so it should work with the latest Detectron2. Please follow the instructions and it should work; do not use the old Caffe2 model, since that one is deprecated and no longer supported.
The converted.onnx that failed to convert to TRT is below:
converted.onnx link: https://pan.baidu.com/s/1Le5k0MjM6-yxcCVBUeNrtQ extraction code: yw2k (shared via Baidu Netdisk)
I've got exactly the same problem. I built my Detectron2 model with this command:
python detectron2/tools/deploy/export_model.py \
    --sample-image 1344x1344.jpg \
    --config-file detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
    --export-method tracing \
    --format onnx \
    --output ./ \
    MODEL.WEIGHTS path/to/model_final_f10217.pkl \
    MODEL.DEVICE cuda
I only changed the number of classes in the config file. I don't think it's a problem.
Then I used this command and got this error:
python create_onnx.py \
    --exported_onnx /path/to/model.onnx \
    --onnx /path/to/converted.onnx \
    --det2_config /detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
    --det2_weights /model_final_f10217.pkl \
    --sample_image any_image.jpg
p2, p3, p4, p5 = backbone()
File "C:\Users\Photoheyler\Desktop\YoloV7Instance\TensorRT\samples\python\detectron2\create_onnx.py", line 363, in backbone
    return p2.outputs[0], p3.outputs[0], p4.outputs[0], p5.outputs[0]
AttributeError: 'NoneType' object has no attribute 'outputs'
This is my ONNX model: https://www.dropbox.com/s/hbgjmxhjrhyic8m/export.onnx?dl=0, my weights: https://www.dropbox.com/s/vcirxplw9b0rouy/model_final.pth?dl=0, my config file: https://www.dropbox.com/s/817il9gpsy6hwor/street_mask_rcnn_R_101_FPN_3x.yaml?dl=0
@zhuimeng2080 and @Photoheyler I had exactly the same problems converting the model to TensorRT. In the end I used MMDeploy to convert and run the model on the TensorRT backend. It took me a day to install everything correctly for deploying the model in a C++ application on Windows, but it works like a charm.
@antoszy Can you give me a short description of how you converted the model in MMDeploy?
Just follow the instructions from the README.md of their repository. You are probably interested in the sections getting_started, Build for Linux, and How to convert model.
@Photoheyler please see this comment https://github.com/NVIDIA/TensorRT/issues/2546#issuecomment-1352532815 and I hope it helps.
Hi there! I am sorry but I am facing the exact same issue.
Command:
python create_onnx.py \
    --exported_onnx /workspace/model.onnx \
    --onnx /workspace/converted.onnx \
    --det2_config /workspace/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
    --det2_weights ~/.torch/iopath_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl \
    --sample_image /workspace/1344x1344.jpg
Error:
Traceback (most recent call last):
File "create_onnx.py", line 545, in
I am using a Docker container. While my ONNX model was created quite easily using detectron2/tools/deploy/export_model.py, I am unable to do the converted.onnx part.
Can you help me?
Specifications: CUDA 11.8.0 + cuDNN 8.6, TensorRT 8.5.1.7
@RajUpadhyay This https://github.com/NVIDIA/TensorRT/issues/2546#issuecomment-1374012936
@azhurkevich Thank you for the reply. I saw the link you sent me and yes, that could have also worked. I found another way but in a sense, it does the same thing.
Thank you again.
Hello @RajUpadhyay, did you solve it? I am facing the same error: AttributeError: 'NoneType' object has no attribute 'outputs'
@hannah12356 Yes, as @azhurkevich mentioned, you can follow the link he gave. You can also follow this link, it does the same thing.
Good Luck!
thank you @RajUpadhyay
Closing inactive issues; I see there is already a WAR (workaround). Thanks all!