mmdetection-to-tensorrt
inference
I can produce the TRT model, but there is a warning:
"Warning: Encountered known unsupported method torch.Tensor.new_zeros"
But during the test, the following error comes up:
"[TensorRT] ERROR: (Unnamed Layer* 204) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 167 100
Traceback (most recent call last):
File "demo/inference.py", line 61, in
The failure is at mmdetection2trt/mm2trt/mmdet2trt/apis/inference.py line 39, "result = model(tensor)", and I do not understand what is wrong there.
Hi, thanks for using my repo. In most cases this warning is not a big deal, just ignore it. Could you share the model and the convert/inference script with me? Let me see what is wrong with the inference.
@grimoire Yes, you did a great job. The model is a very simple RetinaNet; this is the command I run:
python demo/inference.py ./000000004765.jpg /home/nie/TXT/onnx_mmdet/configs/retinanet/retinanet_r50_fpn_1x_coco.py /home/nie/TXT/onnx_mmdet/retinanet_r50_fpn_1x_coco_20200130e.rt2398 ./9mmdet
Below is the code I used to generate the model:
import numpy as np
import tensorrt
import torch
from mmdet2trt import mmdet2trt
from mmdet2trt.apis.inference import init_detector
from mmdet2trt.apis.inference import inference_detector
cfg_path="/home/nie/pkg/mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py"
weight_path="/home/nie/pkg/mmdetection/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth"
image_path="./000000004765.jpg"
save_path="/home/nie/mmdetection2trt/mm2trt/mm.pth"
engine_path = "/home/nie/mmdetection2trt/mm2trt/mm.engine"
opt_shape_param=[
    [
        [1,3,320,320],    # min shape
        [1,3,1280,1280],  # optimize shape
        [1,3,1344,1344],  # max shape
    ]
]
max_workspace_size=1<<30  # some module need large workspace, add workspace size when OOM.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)
torch.save(trt_model.state_dict(), save_path)
trt_model = init_detector(save_path)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
But it always reports an error at this line:
shape = tuple(self.context.get_binding_shape(idx))
I am trying to figure out what might be causing it. Is my script wrong?
@grimoire
I printed some values and got an error:
"batch_size: 1
output_names: 4
output_names: output_0
output_names: output_1
output_names: output_2
output_names: output_3
idx: 1
[TensorRT] ERROR: (Unnamed Layer* 197) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 80 50
context.get_binding_shape(idx): (0)
[TensorRT] ERROR: (Unnamed Layer* 197) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 80 50
Traceback (most recent call last):
File "mm2trt.py", line 24, in
@manhongnie
Hi
Your code works fine for me (official RetinaNet, and an image from COCO).
I guess the torch2trt you are using is not the latest one, right? In https://github.com/grimoire/torch2trt_dynamic/blob/master/torch2trt/torch2trt.py,
shape = tuple(self.context.get_binding_shape(idx))
is at line 394.
Try pulling the latest one from https://github.com/grimoire/torch2trt_dynamic.
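A quick way to double check which torch2trt your Python environment actually loads, and where that statement sits, is a minimal sketch like this (assuming the fork is installed under the torch2trt package name, as in the link above):
import inspect
import torch2trt.torch2trt as t2t   # submodule path follows the repo layout linked above

src = inspect.getsourcefile(t2t)
print("torch2trt is loaded from:", src)

# print the line numbers that call get_binding_shape, to tell old and new checkouts apart
with open(src) as f:
    for lineno, line in enumerate(f, start=1):
        if "get_binding_shape" in line:
            print(lineno, line.rstrip())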
@grimoire No, I am using the latest version, but it still fails. I am not very familiar with torch2trt, so there is probably something wrong with my script. Could you share your script?
@manhongnie
import numpy as np
import tensorrt
import torch
from mmdet2trt import mmdet2trt
from mmdet2trt.apis.inference import init_detector
from mmdet2trt.apis.inference import inference_detector
from os.path import expanduser
home = expanduser("~")
cfg_path=home+"/space/mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py"
weight_path=home+"/torch_checkpoints/mmdet/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth"
image_path=home+"/space/tmp/mmdet2_test/test/000000004765.jpg"
save_path=home+"/space/tmp/mmdet2_test/mm.pth"
engine_path = home+"/space/tmp/mmdet2_test/mm.engine"
opt_shape_param=[
    [
        [1,3,320,320],    # min shape
        [1,3,1280,1280],  # optimize shape
        [1,3,1344,1344],  # max shape
    ]
]
max_workspace_size=1<<30 # some module need large workspace, add workspace size when OOM.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)
torch.save(trt_model.state_dict(), save_path)
trt_model = init_detector(save_path)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")
print(num_detections, trt_bbox, trt_score, trt_cls)
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
My test code, almost the same as yours.
It seems there is something wrong with the input shape. Could you check the shape of the tensor fed to the model in mmdet2trt/apis/inference.py, at result = model(tensor)?
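For example, a temporary debug print just before that call could look like this (a hypothetical local edit to inference.py, not code shipped with the repo):
# in mmdet2trt/apis/inference.py, right before the forward call
print("input tensor shape:", tuple(tensor.shape))            # expect something like (1, 3, H, W)
print("input tensor dtype / device:", tensor.dtype, tensor.device)
result = model(tensor)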
@grimoire I checked: the input tensor size is (1, 3, 800, 800). I am checking it again to see what is going on.
@grimoire This is the information I printed:
(mla) nie@nie0315:~/mmdetection2trt/mm2trt$ python mm2trt.py
/home/nie/anaconda3/envs/mla/lib/python3.7/site-packages/setuptools/distutils_patch.py:26: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
  "Distutils was imported before Setuptools. This usage is discouraged "
cls_score: torch.Size([1, 720, 77, 77])
bbox_pred: torch.Size([1, 36, 77, 77])
cls_score: torch.Size([1, 720, 39, 39])
bbox_pred: torch.Size([1, 36, 39, 39])
cls_score: torch.Size([1, 720, 20, 20])
bbox_pred: torch.Size([1, 36, 20, 20])
cls_score: torch.Size([1, 720, 10, 10])
bbox_pred: torch.Size([1, 36, 10, 10])
cls_score: torch.Size([1, 720, 5, 5])
bbox_pred: torch.Size([1, 36, 5, 5])
cls_score: (1, 720, 77, 77)
bbox_pred: (1, 36, 77, 77)
cls_score: (1, 720, 39, 39)
bbox_pred: (1, 36, 39, 39)
cls_score: (1, 720, 20, 20)
bbox_pred: (1, 36, 20, 20)
cls_score: (1, 720, 10, 10)
bbox_pred: (1, 36, 10, 10)
cls_score: (1, 720, 5, 5)
bbox_pred: (1, 36, 5, 5)
Warning: Encountered known unsupported method torch.Tensor.new_zeros
tensor: tensor([[[[1.1015, 1.1015, 1.1015, ..., 1.1358, 1.1358, 1.1358],
[1.1187, 1.1015, 1.1015, ..., 1.1358, 1.1358, 1.1358],
[1.1187, 1.1187, 1.1015, ..., 1.1358, 1.1358, 1.1358],
...,
[1.0673, 1.0673, 1.0673, ..., 1.1015, 1.0844, 1.0844],
[1.0673, 1.0673, 1.0673, ..., 1.1015, 1.1015, 1.1015],
[1.0673, 1.0673, 1.0673, ..., 1.1015, 1.1015, 1.1015]],
[[1.1331, 1.1331, 1.1331, ..., 1.1681, 1.1681, 1.1681],
[1.1506, 1.1331, 1.1331, ..., 1.1681, 1.1681, 1.1681],
[1.1506, 1.1506, 1.1331, ..., 1.1681, 1.1681, 1.1681],
...,
[1.0980, 1.0980, 1.0980, ..., 1.1331, 1.1331, 1.1331],
[1.0980, 1.0980, 1.0980, ..., 1.1331, 1.1331, 1.1331],
[1.0980, 1.0980, 1.0980, ..., 1.1331, 1.1331, 1.1331]],
[[1.2457, 1.2457, 1.2457, ..., 1.2805, 1.2805, 1.2805],
[1.2457, 1.2457, 1.2457, ..., 1.2805, 1.2805, 1.2805],
[1.2631, 1.2631, 1.2457, ..., 1.2805, 1.2805, 1.2805],
...,
[1.2108, 1.2108, 1.2108, ..., 1.2457, 1.2457, 1.2457],
[1.2108, 1.2108, 1.2108, ..., 1.2457, 1.2457, 1.2457],
[1.2108, 1.2108, 1.2108, ..., 1.2457, 1.2457, 1.2457]]]],
device='cuda:0')
img_metas: [DataContainer({'filename': './000000004765.jpg', 'ori_filename': './000000004765.jpg', 'ori_shape': (612, 612, 3), 'img_shape': (800, 800, 3), 'pad_shape': (800, 800, 3), 'scale_factor': array([1.3071896, 1.3071896, 1.3071896, 1.3071896], dtype=float32), 'flip': False, 'flip_direction': None, 'img_norm_cfg': {'mean': array([123.675, 116.28, 103.53], dtype=float32), 'std': array([58.395, 57.12, 57.375], dtype=float32), 'to_rgb': True}})]
scale_factor: [1.3071896 1.3071896 1.3071896 1.3071896]
scale_factor: tensor([1.3072, 1.3072, 1.3072, 1.3072], device='cuda:0')
batch_size: 1
output_names: 4
output_names: output_0
output_names: output_1
output_names: output_2
output_names: output_3
idx: 1
[TensorRT] ERROR: (Unnamed Layer* 197) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 39 50
context.get_binding_shape(idx): (0)
[TensorRT] ERROR: (Unnamed Layer* 197) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 39 50
Traceback (most recent call last):
File "mm2trt.py", line 27, in
@manhongnie
I see. The input data is fine.
The shape = tuple(self.context.get_binding_shape(idx))
is still at line 404; did you switch back to the old torch2trt?
If your GPU is a 2080 Ti or 2070 Super, could you share the generated pth with me via Google Drive or some other netdisk?
@grimoire
My graphics card is a 2080 Super. I worked around the problem: the input size has to be fixed, and only 800*800 works; otherwise the shapes fail to match and an error is reported. But I have hit a new problem: CUDA reports an illegal memory access. What type of graphics card do you use?
"[TensorRT] ERROR: writeArchive.cpp (130)-Cuda Error in writeGlob: 700 (an illegal memory access was encountered)
[TensorRT] ERROR: Parameter check failed at: ../builder/cudnnBuilderUtils.cpp::~ScopedCudaStream::72, condition: cudaStreamDestroy(mStream) failure.
[TensorRT] ERROR: FAILED_ALLOCATION: std::exception
Traceback (most recent call last):
File "mm2trt.py", line 31, in
@manhongnie The repo has been tested on a 2080 Ti, 2070 Super, Jetson TX2 and Jetson Nano. It is strange; the model should support dynamic input shapes (that is the reason why I forked torch2trt_dynamic from nvidia/torch2trt). Are you sure that the result is right?
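One way to check whether the serialized engine really carries a dynamic optimization profile is to inspect it with the TensorRT Python API; a minimal sketch, assuming engine_path points at the .engine file written by the script above:
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# -1 in the binding shape marks a dynamic dimension
print("input binding shape:", engine.get_binding_shape(0))
# min/opt/max of profile 0 for binding 0; should match opt_shape_param
print("profile shapes:", engine.get_profile_shape(0, 0))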
@grimoire
Not sure, because my code has not run through yet; it reports an illegal memory error and I do not know why. Is it related to this?
"
context.get_binding_shape(idx): (1,)
"
All the errors are here:
"num_detections: Traceback (most recent call last):
File "mm2trt.py", line 33, in
My current script (imports the same as before) is:
from os.path import expanduser
home = expanduser("~")
cfg_path="/home/nie/pkg/mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py"
weight_path="/home/nie/pkg/mmdetection/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth"
image_path="./000000004765.jpg"
save_path="/home/nie/mmdetection2trt/mm2trt/mm.pth"
engine_path = "/home/nie/mmdetection2trt/mm2trt/mm.engine"
opt_shape_param=[
    [
        [1,3,320,320],   # min shape
        [1,3,800,800],   # optimize shape
        [1,3,1344,1344], # max shape
    ]
]
max_workspace_size=1<<30  # some module need large workspace, add workspace size when OOM.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)
torch.save(trt_model.state_dict(), save_path)
trt_model = init_detector(save_path)
#img = cv2.imread(image_path)
#img = cv2.resize(img,(800,800))
#cv2.imwrite(image_path, img)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")
#print(num_detections, trt_bbox, trt_score, trt_cls)
print("num_detections: ", num_detections, "\n")
print("trt_bbox: ", trt_bbox, "\n")
print("trt_score: ", trt_score, "\n")
print("trt_cls: ", trt_cls, "\n")
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
Do I need to send you the pth file? My GPU has the same compute capability as a 2080 Ti, so in theory it should not be a problem. Did I set something up wrong, or did you update something?
@manhongnie You can send me the pth, the mmdet2trt/torch2trt code, the test script, or anything related, and I will try it. It might take some time. You can also create a new conda environment, deploy everything needed, and see if it works (especially torch2trt_dynamic).
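Before rebuilding the whole mmdet2trt pipeline in the new environment, it may also help to test torch2trt_dynamic on its own with a toy model; a minimal sketch, assuming the fork keeps its README-style torch2trt(model, [x], opt_shape_param=...) interface:
import torch
from torch2trt import torch2trt   # the dynamic fork installs under the torch2trt package name

# a tiny conv is enough to tell whether torch2trt_dynamic itself works here
model = torch.nn.Conv2d(3, 8, 3, padding=1).cuda().eval()
x = torch.ones(1, 3, 224, 224).cuda()

opt_shape_param = [[
    [1, 3, 128, 128],   # min shape
    [1, 3, 224, 224],   # optimize shape
    [1, 3, 512, 512],   # max shape
]]
model_trt = torch2trt(model, [x], fp16_mode=False, opt_shape_param=opt_shape_param)

# feed a shape different from the build-time input to confirm dynamic shapes work
y = model_trt(torch.ones(1, 3, 320, 320).cuda())
print(y.shape)   # expected: torch.Size([1, 8, 320, 320])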
@grimoire Okay, I will recreate an environment for testing. Can you send me your TensorRT model?
@manhongnie Here it is. The model was created on a 2080 Ti. TensorRT may do some device-specific optimization, so I cannot promise this model works on your device. https://drive.google.com/file/d/1Q6x_iHa0H7O1cgB3Q_jWVC049Qz6e6oO/view?usp=sharing
@grimoire What system are you using?
@manhongnie Ubuntu 18.04, Python 3.7, PyTorch 1.4.0, CUDA 10.0, GPU driver 440, TensorRT 7.0.0.11. You can test in a new Docker container.
Sorry, it seems I had restricted downloads of the shared file. It is unlocked now. By the way, I have made a lot of updates to torch2trt_dynamic/amirstan_plugin and this repo, including some bug fixes. Try again and see if it works on your side.
I get the same error.