mmdetection-to-tensorrt
inference
I can produce the TRT model, but there is a warning:
"Warning: Encountered known unsupported method torch.Tensor.new_zeros"
But during the test, the following error comes up:
"[TensorRT] ERROR: (Unnamed Layer* 204) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 167 100
Traceback (most recent call last):
File "demo/inference.py", line 61, in
The failure is at mmdetection2trt/mm2trt/mmdet2trt/apis/inference.py line 39, "result = model(tensor)", and I do not understand what is wrong there.
Hi, thanks for using my repo. In most cases this warning is not a big deal, just ignore it. Could you share the model and the convert/inference script with me? Let me see what is wrong with the inference.
@grimoire Yes, you did a great job. The model is a very simple RetinaNet; this is the command I run:
python demo/inference.py ./000000004765.jpg /home/nie/TXT/onnx_mmdet/configs/retinanet/retinanet_r50_fpn_1x_coco.py /home/nie/TXT/onnx_mmdet/retinanet_r50_fpn_1x_coco_20200130e.rt2398 ./9mmdet
Below is the code I used to generate the model:
import numpy as np
import tensorrt
import torch
from mmdet2trt import mmdet2trt
from mmdet2trt.apis.inference import init_detector
from mmdet2trt.apis.inference import inference_detector
cfg_path="/home/nie/pkg/mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py"
weight_path="/home/nie/pkg/mmdetection/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth"
image_path="./000000004765.jpg"
save_path="/home/nie/mmdetection2trt/mm2trt/mm.pth"
engine_path = "/home/nie/mmdetection2trt/mm2trt/mm.engine"
opt_shape_param=[
    [
        [1,3,320,320],    # min shape
        [1,3,1280,1280],  # optimize shape
        [1,3,1344,1344],  # max shape
    ]
]
max_workspace_size=1<<30  # some module need large workspace, add workspace size when OOM.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)
torch.save(trt_model.state_dict(), save_path)
trt_model = init_detector(save_path)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
But it always reports an error at this line:
shape = tuple(self.context.get_binding_shape(idx))
I am trying to figure out what might be causing it. Is my script wrong?
@grimoire
I printed some values and got an error:
"batch_size: 1
output_names: 4
output_names: output_0
output_names: output_1
output_names: output_2
output_names: output_3
idx: 1
[TensorRT] ERROR: (Unnamed Layer* 197) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 80 50
context.get_binding_shape(idx): (0)
[TensorRT] ERROR: (Unnamed Layer* 197) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 80 50
Traceback (most recent call last):
File "mm2trt.py", line 24, in
@manhongnie
Hi
Your code works fine for me (official RetinaNet, and an image from COCO).
I guess the torch2trt you are using is not the latest one, right? In https://github.com/grimoire/torch2trt_dynamic/blob/master/torch2trt/torch2trt.py,
shape = tuple(self.context.get_binding_shape(idx))
is at line 394.
Try pulling the latest one from https://github.com/grimoire/torch2trt_dynamic.
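A quick way to double check which torch2trt your Python environment actually loads, and where that statement sits, is a minimal sketch like this (assuming the fork is installed under the torch2trt package name, as in the link above):
import inspect
import torch2trt.torch2trt as t2t   # submodule path follows the repo layout linked above

src = inspect.getsourcefile(t2t)
print("torch2trt is loaded from:", src)

# print the line numbers that call get_binding_shape, to tell old and new checkouts apart
with open(src) as f:
    for lineno, line in enumerate(f, start=1):
        if "get_binding_shape" in line:
            print(lineno, line.rstrip())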
@grimoire No, I am using the latest version, but it still fails. I am not very familiar with torch2trt, so there is probably something wrong with my script. Could you share your script?
@manhongnie
import numpy as np
import tensorrt
import torch
from mmdet2trt import mmdet2trt
from mmdet2trt.apis.inference import init_detector
from mmdet2trt.apis.inference import inference_detector
from os.path import expanduser
home = expanduser("~")
cfg_path=home+"/space/mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py"
weight_path=home+"/torch_checkpoints/mmdet/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth"
image_path=home+"/space/tmp/mmdet2_test/test/000000004765.jpg"
save_path=home+"/space/tmp/mmdet2_test/mm.pth"
engine_path = home+"/space/tmp/mmdet2_test/mm.engine"
opt_shape_param=[
    [
        [1,3,320,320],    # min shape
        [1,3,1280,1280],  # optimize shape
        [1,3,1344,1344],  # max shape
    ]
]
max_workspace_size=1<<30 # some module need large workspace, add workspace size when OOM.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)
torch.save(trt_model.state_dict(), save_path)
trt_model = init_detector(save_path)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")
print(num_detections, trt_bbox, trt_score, trt_cls)
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
My test code, almost the same as yours.
It seems there is something wrong with the input shape. Could you check the shape of the tensor fed to the model in mmdet2trt/apis/inference.py, at result = model(tensor)?
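For example, a temporary debug print just before that call could look like this (a hypothetical local edit to inference.py, not code shipped with the repo):
# in mmdet2trt/apis/inference.py, right before the forward call
print("input tensor shape:", tuple(tensor.shape))            # expect something like (1, 3, H, W)
print("input tensor dtype / device:", tensor.dtype, tensor.device)
result = model(tensor)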
@grimoire I checked: the input tensor size is (1, 3, 800, 800). I am checking it again to see what is going on.
@grimoire This is the information I printed:
(mla) nie@nie0315:~/mmdetection2trt/mm2trt$ python mm2trt.py
/home/nie/anaconda3/envs/mla/lib/python3.7/site-packages/setuptools/distutils_patch.py:26: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
  "Distutils was imported before Setuptools. This usage is discouraged "
cls_score: torch.Size([1, 720, 77, 77])
bbox_pred: torch.Size([1, 36, 77, 77])
cls_score: torch.Size([1, 720, 39, 39])
bbox_pred: torch.Size([1, 36, 39, 39])
cls_score: torch.Size([1, 720, 20, 20])
bbox_pred: torch.Size([1, 36, 20, 20])
cls_score: torch.Size([1, 720, 10, 10])
bbox_pred: torch.Size([1, 36, 10, 10])
cls_score: torch.Size([1, 720, 5, 5])
bbox_pred: torch.Size([1, 36, 5, 5])
cls_score: (1, 720, 77, 77)
bbox_pred: (1, 36, 77, 77)
cls_score: (1, 720, 39, 39)
bbox_pred: (1, 36, 39, 39)
cls_score: (1, 720, 20, 20)
bbox_pred: (1, 36, 20, 20)
cls_score: (1, 720, 10, 10)
bbox_pred: (1, 36, 10, 10)
cls_score: (1, 720, 5, 5)
bbox_pred: (1, 36, 5, 5)
Warning: Encountered known unsupported method torch.Tensor.new_zeros
tensor: tensor([[[[1.1015, 1.1015, 1.1015, ..., 1.1358, 1.1358, 1.1358],
[1.1187, 1.1015, 1.1015, ..., 1.1358, 1.1358, 1.1358],
[1.1187, 1.1187, 1.1015, ..., 1.1358, 1.1358, 1.1358],
...,
[1.0673, 1.0673, 1.0673, ..., 1.1015, 1.0844, 1.0844],
[1.0673, 1.0673, 1.0673, ..., 1.1015, 1.1015, 1.1015],
[1.0673, 1.0673, 1.0673, ..., 1.1015, 1.1015, 1.1015]],
[[1.1331, 1.1331, 1.1331, ..., 1.1681, 1.1681, 1.1681],
[1.1506, 1.1331, 1.1331, ..., 1.1681, 1.1681, 1.1681],
[1.1506, 1.1506, 1.1331, ..., 1.1681, 1.1681, 1.1681],
...,
[1.0980, 1.0980, 1.0980, ..., 1.1331, 1.1331, 1.1331],
[1.0980, 1.0980, 1.0980, ..., 1.1331, 1.1331, 1.1331],
[1.0980, 1.0980, 1.0980, ..., 1.1331, 1.1331, 1.1331]],
[[1.2457, 1.2457, 1.2457, ..., 1.2805, 1.2805, 1.2805],
[1.2457, 1.2457, 1.2457, ..., 1.2805, 1.2805, 1.2805],
[1.2631, 1.2631, 1.2457, ..., 1.2805, 1.2805, 1.2805],
...,
[1.2108, 1.2108, 1.2108, ..., 1.2457, 1.2457, 1.2457],
[1.2108, 1.2108, 1.2108, ..., 1.2457, 1.2457, 1.2457],
[1.2108, 1.2108, 1.2108, ..., 1.2457, 1.2457, 1.2457]]]],
device='cuda:0')
img_metas: [DataContainer({'filename': './000000004765.jpg', 'ori_filename': './000000004765.jpg', 'ori_shape': (612, 612, 3), 'img_shape': (800, 800, 3), 'pad_shape': (800, 800, 3), 'scale_factor': array([1.3071896, 1.3071896, 1.3071896, 1.3071896], dtype=float32), 'flip': False, 'flip_direction': None, 'img_norm_cfg': {'mean': array([123.675, 116.28, 103.53], dtype=float32), 'std': array([58.395, 57.12, 57.375], dtype=float32), 'to_rgb': True}})]
scale_factor: [1.3071896 1.3071896 1.3071896 1.3071896]
scale_factor: tensor([1.3072, 1.3072, 1.3072, 1.3072], device='cuda:0')
batch_size: 1
output_names: 4
output_names: output_0
output_names: output_1
output_names: output_2
output_names: output_3
idx: 1
[TensorRT] ERROR: (Unnamed Layer* 197) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 39 50
context.get_binding_shape(idx): (0)
[TensorRT] ERROR: (Unnamed Layer* 197) [ElementWise]: dimensions not compatible for elementwise
[TensorRT] ERROR: shapeMachine.cpp (285)-Shape Error in operator(): broadcast with incompatible dimensions
[TensorRT] ERROR: Instruction: CHECK_BROADCAST 39 50
Traceback (most recent call last):
File "mm2trt.py", line 27, in
@manhongnie
I see. The input data is fine.
The shape = tuple(self.context.get_binding_shape(idx))
is still at line 404; did you switch back to the old torch2trt?
If your GPU is a 2080 Ti or 2070 Super, could you share the generated pth with me via Google Drive or some other netdisk?
@grimoire
My graphics card is a 2080 Super. I worked around the problem: the input size has to be fixed, and only 800*800 works; otherwise the shapes fail to match and an error is reported. But I have hit a new problem: CUDA reports an illegal memory access. What type of graphics card do you use?
"[TensorRT] ERROR: writeArchive.cpp (130)-Cuda Error in writeGlob: 700 (an illegal memory access was encountered)
[TensorRT] ERROR: Parameter check failed at: ../builder/cudnnBuilderUtils.cpp::~ScopedCudaStream::72, condition: cudaStreamDestroy(mStream) failure.
[TensorRT] ERROR: FAILED_ALLOCATION: std::exception
Traceback (most recent call last):
File "mm2trt.py", line 31, in
@manhongnie The repo has been tested on a 2080 Ti, 2070 Super, Jetson TX2 and Jetson Nano. It is strange; the model should support dynamic input shapes (that is the reason why I forked torch2trt_dynamic from nvidia/torch2trt). Are you sure that the result is right?
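One way to check whether the serialized engine really carries a dynamic optimization profile is to inspect it with the TensorRT Python API; a minimal sketch, assuming engine_path points at the .engine file written by the script above:
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# -1 in the binding shape marks a dynamic dimension
print("input binding shape:", engine.get_binding_shape(0))
# min/opt/max of profile 0 for binding 0; should match opt_shape_param
print("profile shapes:", engine.get_profile_shape(0, 0))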
@grimoire
Not sure, because my code has not run through yet; it reports an illegal memory error and I do not know why. Is it related to this?
"
context.get_binding_shape(idx): (1,)
"
All the errors are here:
"num_detections: Traceback (most recent call last):
File "mm2trt.py", line 33, in
My current script (imports the same as before) is:
from os.path import expanduser
home = expanduser("~")
cfg_path="/home/nie/pkg/mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py"
weight_path="/home/nie/pkg/mmdetection/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth"
image_path="./000000004765.jpg"
save_path="/home/nie/mmdetection2trt/mm2trt/mm.pth"
engine_path = "/home/nie/mmdetection2trt/mm2trt/mm.engine"
opt_shape_param=[
    [
        [1,3,320,320],   # min shape
        [1,3,800,800],   # optimize shape
        [1,3,1344,1344], # max shape
    ]
]
max_workspace_size=1<<30  # some module need large workspace, add workspace size when OOM.
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)
torch.save(trt_model.state_dict(), save_path)
trt_model = init_detector(save_path)
#img = cv2.imread(image_path)
#img = cv2.resize(img,(800,800))
#cv2.imwrite(image_path, img)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")
#print(num_detections, trt_bbox, trt_score, trt_cls)
print("num_detections: ", num_detections, "\n")
print("trt_bbox: ", trt_bbox, "\n")
print("trt_score: ", trt_score, "\n")
print("trt_cls: ", trt_cls, "\n")
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
Do I need to send you the pth file? My GPU has the same compute capability as a 2080 Ti, so in theory it should not be a problem. Did I set something up wrong, or did you update something?
@manhongnie You can send me the pth, the mmdet2trt/torch2trt code, the test script, or anything related, and I will try it. It might take some time. You can also create a new conda environment, deploy everything needed, and see if it works (especially torch2trt_dynamic).
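Before rebuilding the whole mmdet2trt pipeline in the new environment, it may also help to test torch2trt_dynamic on its own with a toy model; a minimal sketch, assuming the fork keeps its README-style torch2trt(model, [x], opt_shape_param=...) interface:
import torch
from torch2trt import torch2trt   # the dynamic fork installs under the torch2trt package name

# a tiny conv is enough to tell whether torch2trt_dynamic itself works here
model = torch.nn.Conv2d(3, 8, 3, padding=1).cuda().eval()
x = torch.ones(1, 3, 224, 224).cuda()

opt_shape_param = [[
    [1, 3, 128, 128],   # min shape
    [1, 3, 224, 224],   # optimize shape
    [1, 3, 512, 512],   # max shape
]]
model_trt = torch2trt(model, [x], fp16_mode=False, opt_shape_param=opt_shape_param)

# feed a shape different from the build-time input to confirm dynamic shapes work
y = model_trt(torch.ones(1, 3, 320, 320).cuda())
print(y.shape)   # expected: torch.Size([1, 8, 320, 320])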
@grimoire Okay, I will recreate an environment for testing. Can you send me your TensorRT model?
@manhongnie Here it is. The model was created on a 2080 Ti. TensorRT may do some device-specific optimization, so I cannot promise this model works on your device. https://drive.google.com/file/d/1Q6x_iHa0H7O1cgB3Q_jWVC049Qz6e6oO/view?usp=sharing
@grimoire What system are you using?
@manhongnie Ubuntu 18.04, Python 3.7, PyTorch 1.4.0, CUDA 10.0, GPU driver 440, TensorRT 7.0.0.11. You can test in a new Docker container.
Sorry, it seems I had restricted downloads of the shared file. It is unlocked now. By the way, I have made a lot of updates to torch2trt_dynamic/amirstan_plugin and this repo, including some bug fixes. Try again and see if it works on your side.
I get the same error.