Very large drop in performance when using bigger dynamic images with TensorRT
I would like to ask if anyone knows why, when I deploy and test the same Mask R-CNN model (with DCN) with the TensorRT backend, mmdeploy/tools/test.py reports almost normal performance on images cropped to a fixed size (1024x1024), while the result on the whole images is very poor.
Both deploy configs are based on configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py.
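For reference, a typical mmdeploy evaluation command looks like the following (a sketch; the model config, engine, and work-dir paths are placeholders):

python tools/test.py \
    configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py \
    xxx/configs/mask_rcnn_dcn.py \
    --model work_dir/end2end.engine \
    --metrics bbox segm \
    --device cuda:0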
- The deploy config and test result for the cropped images. TensorRT deploy config:
_base_ = [
    '../_base_/base_instance-seg_dynamic.py',
    '../../_base_/backends/tensorrt.py'
]
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 31),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 1024, 1024],
                    opt_shape=[1, 3, 1024, 1024],
                    max_shape=[1, 3, 1024, 1024])))
    ])
The modified part of the mmdetection config:
model = dict(
    test_cfg=dict(
        rpn=dict(
            nms_pre=2000,
            max_per_img=1000),
        rcnn=dict(
            max_per_img=200)))
data = dict(
    test=dict(
        ann_file='xxx/annotations/val_crop_1024.json',
        img_prefix='xxx/images/val_crop_1024/'))
The test log:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.175
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.489
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.081
Average Precision (AP) @[ IoU=0.50 | area= small | maxDets=1000 ] = 0.466
Average Precision (AP) @[ IoU=0.50 | area=medium | maxDets=1000 ] = 0.513
Average Precision (AP) @[ IoU=0.50 | area= large | maxDets=1000 ] = 0.487
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.330
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.730
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.252
Average Recall (AR) @[ IoU=0.50 | area= small | maxDets=1000 ] = 0.722
Average Recall (AR) @[ IoU=0.50 | area=medium | maxDets=1000 ] = 0.702
Average Recall (AR) @[ IoU=0.50 | area= large | maxDets=1000 ] = 0.739
- The deploy config and test result for the whole images (dynamic). TensorRT deploy config:
_base_ = [
    '../_base_/base_instance-seg_dynamic.py',
    '../../_base_/backends/tensorrt.py'
]
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 31),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 4171, 3128],
                    opt_shape=[1, 3, 4600, 3448],
                    max_shape=[1, 3, 5184, 3456])))
    ])
The modified part of the mmdetection config:
model = dict(
    test_cfg=dict(
        rpn=dict(
            nms_pre=6000,
            max_per_img=6000),
        rcnn=dict(
            max_per_img=600)))
data = dict(
    test=dict(
        ann_file='xxx/annotations/val_whole.json',
        img_prefix='xxx/images/val_whole/'))
The test log:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.001
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.003
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= small | maxDets=1000 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area=medium | maxDets=1000 ] = 0.001
Average Precision (AP) @[ IoU=0.50 | area= large | maxDets=1000 ] = 0.005
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.001
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.002
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= small | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area=medium | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= large | maxDets=1000 ] = 0.004
PS: the dataset is not COCO. I know there is a topK limit (<= 3840) in TensorRT, but the AP50 on the whole images should still be around 0.472 (tested with mmdetection v2.22.0).
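For a rough sense of why topK truncation bites much harder on the whole images: assuming the standard Mask R-CNN FPN strides and 3 anchors per location (an assumption about this model), the RPN anchor count scales with image area, so a fixed topK keeps a far smaller fraction of candidates on the large inputs. A back-of-the-envelope sketch:

# rough RPN anchor count, assuming FPN strides (4, 8, 16, 32, 64)
# and 3 anchors per location
def num_anchors(h, w, strides=(4, 8, 16, 32, 64), per_loc=3):
    return sum((h // s) * (w // s) * per_loc for s in strides)

print(num_anchors(1024, 1024))  # ~0.26 million
print(num_anchors(5184, 3456))  # ~4.5 million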
@StevenGerrad Hi, what is the test result of the whole-image setting with mmdetection/tools/test.py under the following config:
model = dict(
    test_cfg=dict(
        rpn=dict(
            nms_pre=6000,
            max_per_img=6000),
        rcnn=dict(
            max_per_img=600)))
data = dict(
    test=dict(
        ann_file='xxx/annotations/val_whole.json',
        img_prefix='xxx/images/val_whole/'))
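(That is, a plain PyTorch evaluation with mmdetection's own test script, along these lines; the model config and checkpoint paths are placeholders:)

python tools/test.py xxx/configs/mask_rcnn_dcn.py xxx/latest.pth --eval bbox segm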
The test result of the whole images on mmdetection:
- rpn: nms_pre=6000, max_per_img=6000; rcnn: max_per_img=600
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.173
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.473
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.084
Average Precision (AP) @[ IoU=0.50 | area= small | maxDets=1000 ] = 0.502
Average Precision (AP) @[ IoU=0.50 | area=medium | maxDets=1000 ] = 0.494
Average Precision (AP) @[ IoU=0.50 | area= large | maxDets=1000 ] = 0.440
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.324
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.705
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.251
Average Recall (AR) @[ IoU=0.50 | area= small | maxDets=1000 ] = 0.689
Average Recall (AR) @[ IoU=0.50 | area=medium | maxDets=1000 ] = 0.685
Average Recall (AR) @[ IoU=0.50 | area= large | maxDets=1000 ] = 0.757
- rpn: nms_pre=3840, max_per_img=3840; rcnn: max_per_img=600
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.170
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.466
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.084
Average Precision (AP) @[ IoU=0.50 | area= small | maxDets=1000 ] = 0.494
Average Precision (AP) @[ IoU=0.50 | area=medium | maxDets=1000 ] = 0.496
Average Precision (AP) @[ IoU=0.50 | area= large | maxDets=1000 ] = 0.441
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.313
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.681
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.243
Average Recall (AR) @[ IoU=0.50 | area= small | maxDets=1000 ] = 0.663
Average Recall (AR) @[ IoU=0.50 | area=medium | maxDets=1000 ] = 0.675
Average Recall (AR) @[ IoU=0.50 | area= large | maxDets=1000 ] = 0.740
(Sorry, AP50 should be 0.466 instead of the 0.472 I wrote in my last comment when topK is limited to 3840.)
Hi, maybe you could change the post_processing params for BatchedNMS in the deploy config. As you may know, nms from mmcv is converted to the TensorRT BatchedNMS plugin through a rewrite when exporting to ONNX.
https://github.com/open-mmlab/mmdeploy/blob/b6b22a1b6fae0dddf7476e567065bd2d15d6b356/configs/mmdet/base/base_static.py#L8
post_processing=dict(
    score_threshold=0.05,
    confidence_threshold=0.005,  # for YOLOv3
    iou_threshold=0.5,
    max_output_boxes_per_class=200,
    pre_top_k=5000,
    keep_top_k=100,
    background_label_id=-1)
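For example, in your whole-image deploy config you could enlarge these limits to match your test_cfg (a sketch; the values are guesses, with pre_top_k kept under the TensorRT topK cap of 3840 mentioned above; the nested dict merges with the inherited base config):

codebase_config = dict(
    post_processing=dict(
        max_output_boxes_per_class=600,  # align with rcnn max_per_img=600
        pre_top_k=3840,                  # capped by the TensorRT topK limit
        keep_top_k=600))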
I tried to change the post-processing params, but that didn't solve the problem. Recently I inspected the variables in End2EndModel while running tools/test.py:
https://github.com/open-mmlab/mmdeploy/blob/83b11bc1ca7227497928a57b56653b76501b1368/mmdeploy/codebase/mmdet/deploy/object_detection_model.py#L198

I found them abnormal, but I can't debug the variables inside TRTWrapper in more detail. Do you have any other methods? Thank you.
@StevenGerrad Hi, you could debug while running tools/deploy.py and then step into TRTWrapper.
Hi, it seems TRTWrapper is not called by tools/deploy.py, and I don't know how to check the output of the forward function of TRTWrapper:
https://github.com/open-mmlab/mmdeploy/blob/f957284d546e06398841b9a20aa3d3b8ff5b4700/mmdeploy/backend/tensorrt/wrapper.py#L161
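One workaround is to load the built engine with TRTWrapper directly and inspect its raw outputs outside of test.py, a sketch assuming an instance-seg engine (the engine path, output names, and input size are placeholders):

import torch
from mmdeploy.backend.tensorrt import TRTWrapper

# load the serialized engine; output names must match those in the engine
model = TRTWrapper('work_dir/end2end.engine',
                   output_names=['dets', 'labels', 'masks'])
img = torch.randn(1, 3, 1024, 1024).cuda()  # placeholder preprocessed input
outputs = model(dict(input=img))
print({k: (v.shape, v.dtype) for k, v in outputs.items()})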
@StevenGerrad Make sure you are using a deploy config for the TensorRT backend, such as configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py, and that the machine is not headless:
https://github.com/open-mmlab/mmdeploy/blob/f957284d546e06398841b9a20aa3d3b8ff5b4700/tools/deploy.py#L361