
Very large drop in performance when using bigger dynamic images with TensorRT

Open StevenGerrad opened this issue 3 years ago • 7 comments

I would like to ask if anyone knows why, when deploying and testing the same Mask R-CNN model (with DCN) on the TensorRT backend, the performance reported by mmdeploy/tools/test.py is almost normal when the images are cropped to a fixed size (1024x1024), while the test result on the whole images is very poor.

Both deploy config files are based on configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py.
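For context, the evaluation command looks roughly like the following (the engine filename and config paths are placeholders, not values taken from the issue):

python tools/test.py \
    ${DEPLOY_CFG} \
    ${MODEL_CFG} \
    --model end2end.engine \
    --metrics bbox segm \
    --device cuda:0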

  • The deploy config and test result for the cropped images. TensorRT deploy config:
_base_ = [
    '../_base_/base_instance-seg_dynamic.py',
    '../../_base_/backends/tensorrt.py'
]
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 31),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 1024, 1024],
                    opt_shape=[1, 3, 1024, 1024],
                    max_shape=[1, 3, 1024, 1024]
                )))
    ])

The modified part of the mmdetection config:

model=dict(
    test_cfg=dict(
        rpn=dict(
            nms_pre=2000,
            max_per_img=1000),
        rcnn=dict(
            max_per_img=200,)))
data=dict(
    test=dict(
        ann_file='xxx/annotations/val_crop_1024.json',
        img_prefix='xxx/images/val_crop_1024/'))

The test log:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.175
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.489
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.081
 Average Precision  (AP) @[ IoU=0.50      | area= small | maxDets=1000 ] = 0.466
 Average Precision  (AP) @[ IoU=0.50      | area=medium | maxDets=1000 ] = 0.513
 Average Precision  (AP) @[ IoU=0.50      | area= large | maxDets=1000 ] = 0.487
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.330
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.730
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.252
 Average Recall     (AR) @[ IoU=0.50      | area= small | maxDets=1000 ] = 0.722
 Average Recall     (AR) @[ IoU=0.50      | area=medium | maxDets=1000 ] = 0.702
 Average Recall     (AR) @[ IoU=0.50      | area= large | maxDets=1000 ] = 0.739
  • The deploy config and test result for the whole images (dynamic shapes). TensorRT deploy config:
_base_ = [
    '../_base_/base_instance-seg_dynamic.py',
    '../../_base_/backends/tensorrt.py'
]
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 31),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 4171, 3128],
                    opt_shape=[1, 3, 4600, 3448],
                    max_shape=[1, 3, 5184, 3456]
                )))
    ])

The modified part of the mmdetection config:

model=dict(
    test_cfg=dict(
        rpn=dict(
            nms_pre=6000,
            max_per_img=6000,),
        rcnn=dict(
            max_per_img=600,)))
data=dict(
    test=dict(
        ann_file='xxx/annotations/val_whole.json',
        img_prefix='xxx/images/val_whole/'))

The test log:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.003
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area= small | maxDets=1000 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=medium | maxDets=1000 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50      | area= large | maxDets=1000 ] = 0.005
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.001
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.002
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50      | area= small | maxDets=1000 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50      | area=medium | maxDets=1000 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50      | area= large | maxDets=1000 ] = 0.004

PS: the dataset is not COCO. I know there is a topK limit (<= 3840) in TensorRT, but the AP50 on the whole images should still be around 0.472 (tested with mmdetection v2.22.0).
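To see why the topK limit matters so much more for the whole images, here is a rough count of RPN anchors per FPN level for the largest input shape. It assumes the default Mask R-CNN FPN settings (strides 4 to 64, 3 anchors per location), which is an assumption rather than something read from the actual config:

import math

# Rough per-level RPN anchor counts for a 5184x3456 input, assuming
# the default Mask R-CNN FPN strides and 3 anchors per location
# (assumptions, not values from the actual model config).
strides = [4, 8, 16, 32, 64]
anchors_per_loc = 3
h, w = 3456, 5184
for s in strides:
    n = math.ceil(h / s) * math.ceil(w / s) * anchors_per_loc
    print(f'stride {s:2d}: {n:,} anchors')
# Even the coarsest level (stride 64) yields 54 * 81 * 3 = 13,122
# candidates, so every level exceeds TensorRT's topK cap of 3840
# and nms_pre=6000 cannot take effect inside the engine.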

StevenGerrad avatar Jul 26 '22 06:07 StevenGerrad

@StevenGerrad Hi, what is the test result of the whole-image setting with mmdetection/tools/test.py under the following config?

model=dict(
    test_cfg=dict(
        rpn=dict(
            nms_pre=6000,
            max_per_img=6000,),
        rcnn=dict(
            max_per_img=600,)))
data=dict(
    test=dict(
        ann_file='xxx/annotations/val_whole.json',
        img_prefix='xxx/images/val_whole/'))
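For reference, this is an ordinary mmdetection evaluation, roughly as follows (paths are placeholders):

python tools/test.py ${MODEL_CFG} ${CHECKPOINT} --eval bbox segm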

RunningLeon avatar Jul 27 '22 07:07 RunningLeon

The test result of the whole images on mmdetection:

  • rpn: nms_pre=6000, max_per_img=6000; rcnn: max_per_img=600
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.173
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.473
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.084
 Average Precision  (AP) @[ IoU=0.50      | area= small | maxDets=1000 ] = 0.502
 Average Precision  (AP) @[ IoU=0.50      | area=medium | maxDets=1000 ] = 0.494
 Average Precision  (AP) @[ IoU=0.50      | area= large | maxDets=1000 ] = 0.440
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.324
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.705
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.251
 Average Recall     (AR) @[ IoU=0.50      | area= small | maxDets=1000 ] = 0.689
 Average Recall     (AR) @[ IoU=0.50      | area=medium | maxDets=1000 ] = 0.685
 Average Recall     (AR) @[ IoU=0.50      | area= large | maxDets=1000 ] = 0.757
  • rpn: nms_pre=3840, max_per_img=3840; rcnn: max_per_img=600
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.170
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.466
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.084
 Average Precision  (AP) @[ IoU=0.50      | area= small | maxDets=1000 ] = 0.494
 Average Precision  (AP) @[ IoU=0.50      | area=medium | maxDets=1000 ] = 0.496
 Average Precision  (AP) @[ IoU=0.50      | area= large | maxDets=1000 ] = 0.441
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.313
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.681
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.243
 Average Recall     (AR) @[ IoU=0.50      | area= small | maxDets=1000 ] = 0.663
 Average Recall     (AR) @[ IoU=0.50      | area=medium | maxDets=1000 ] = 0.675
 Average Recall     (AR) @[ IoU=0.50      | area= large | maxDets=1000 ] = 0.740

(Sorry, the AP50 should be 0.466 instead of the 0.472 I wrote in my last comment when topK is limited to 3840.)

StevenGerrad avatar Jul 27 '22 12:07 StevenGerrad

Hi, maybe you could change the post_processing params for BatchedNMS in the deploy config. As you may know, the NMS from mmcv is converted to the TensorRT BatchedNMS plugin through a rewriting pass while exporting to ONNX.

https://github.com/open-mmlab/mmdeploy/blob/b6b22a1b6fae0dddf7476e567065bd2d15d6b356/configs/mmdet/base/base_static.py#L8

    post_processing=dict(
        score_threshold=0.05,
        confidence_threshold=0.005,  # for YOLOv3
        iou_threshold=0.5,
        max_output_boxes_per_class=200,
        pre_top_k=5000,
        keep_top_k=100,
        background_label_id=-1,
    ))
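A minimal sketch of how the whole-image deploy config could raise those limits (the values are illustrative, and pre_top_k remains subject to TensorRT's topK cap of 3840):

_base_ = [
    '../_base_/base_instance-seg_dynamic.py',
    '../../_base_/backends/tensorrt.py'
]
# Illustrative values only, merged over the defaults shown above.
codebase_config = dict(
    post_processing=dict(
        score_threshold=0.05,
        iou_threshold=0.5,
        max_output_boxes_per_class=600,
        pre_top_k=3840,   # clamped by TensorRT's topK limit
        keep_top_k=600,
        background_label_id=-1,
    ))
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 31),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 4171, 3128],
                    opt_shape=[1, 3, 4600, 3448],
                    max_shape=[1, 3, 5184, 3456]
                )))
    ])

Note that these parameters are baked into the exported ONNX, so the model has to be re-converted after changing them.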

RunningLeon avatar Jul 28 '22 06:07 RunningLeon

I tried changing the post-processing params, but it didn't solve the problem. Recently I checked the variables in End2EndModel when the program is called by tools/test.py:

https://github.com/open-mmlab/mmdeploy/blob/83b11bc1ca7227497928a57b56653b76501b1368/mmdeploy/codebase/mmdet/deploy/object_detection_model.py#L198

[screenshot of the inspected variables, omitted]

The values look abnormal, but I can't debug the variables inside TRTWrapper in more detail. Do you have any other suggestions? Thank you.

StevenGerrad avatar Aug 05 '22 02:08 StevenGerrad

@StevenGerrad Hi, you could debug while running tools/deploy.py and then step into TRTWrapper.

RunningLeon avatar Aug 05 '22 06:08 RunningLeon

Hi, it seems like TRTWrapper will not be called by tools/deploy.py. And I don't know how to check the output in the forward function of TRTWrapper:

https://github.com/open-mmlab/mmdeploy/blob/f957284d546e06398841b9a20aa3d3b8ff5b4700/mmdeploy/backend/tensorrt/wrapper.py#L161
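One way to check those outputs is to call the wrapper directly, outside the tool scripts. A minimal sketch, assuming the converted engine is named end2end.engine and uses mmdeploy's instance-seg output names ('dets', 'labels', 'masks'); adjust both for your model:

import torch
from mmdeploy.backend.tensorrt import TRTWrapper

# The engine path and output names below are assumptions.
model = TRTWrapper('end2end.engine',
                   output_names=['dets', 'labels', 'masks'])
# Use a dummy input whose shape lies inside the dynamic profile.
dummy = torch.randn(1, 3, 4600, 3448).cuda()
outputs = model({'input': dummy})
for name, value in outputs.items():
    print(name, tuple(value.shape), value.float().abs().max().item())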

StevenGerrad avatar Aug 06 '22 12:08 StevenGerrad

@StevenGerrad Make sure you are using a deploy config for the TensorRT backend, such as configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py, and that the machine is not headless: https://github.com/open-mmlab/mmdeploy/blob/f957284d546e06398841b9a20aa3d3b8ff5b4700/tools/deploy.py#L361

RunningLeon avatar Aug 08 '22 00:08 RunningLeon