mmdeploy Inconsistent behavior of the test script in mmdetection vs mmdeploy

Thanks for your bug report. We appreciate it a lot.

Checklist

I have searched related issues but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version.

Describe the bug

A clear and concise description of what the bug is.

Inconsistent behavior of the test script in mmdetection vs mmdeploy from the same config file

config file used

...
classes = ('person',)  # variable assigned
data = dict(
    train=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'train.json'),
    val=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'val.json'),
    test=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'val.json'))
...

The script at mmdetection/tools/test.py runs the evaluation without considering the classes variable. The script at mmdeploy/tools/test.py, however, filters the classes during evaluation even though the variable has not been passed into the data.train dict. As a result, the two scripts produce inconsistent test results.

Reproduction

What command or script did you run?

mmdetection/tools/test.py and mmdeploy/tools/test.py

Did you make any modifications on the code or config? Did you understand what you have modified?

config file modified as shown above

Environment

Please run python tools/check_env.py to collect necessary environment information and paste it here.
You may add additional information that may be helpful for locating the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
- Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

2022-05-24 01:54:53,577 - mmdeploy - INFO - TorchVision: 0.9.0
2022-05-24 01:54:53,577 - mmdeploy - INFO - OpenCV: 4.5.5
2022-05-24 01:54:53,577 - mmdeploy - INFO - MMCV: 1.4.0
2022-05-24 01:54:53,577 - mmdeploy - INFO - MMCV Compiler: GCC 7.3
2022-05-24 01:54:53,577 - mmdeploy - INFO - MMCV CUDA Compiler: 10.2
2022-05-24 01:54:53,577 - mmdeploy - INFO - MMDeploy: 0.4.0+3786856
2022-05-24 01:54:53,578 - mmdeploy - INFO - 

2022-05-24 01:54:53,578 - mmdeploy - INFO - **********Backend information**********
[2022-05-24 01:54:54.020] [mmdeploy] [info] [model.cpp:95] Register 'DirectoryModel'
2022-05-24 01:54:54,133 - mmdeploy - INFO - onnxruntime: 1.8.1  ops_is_avaliable : True
2022-05-24 01:54:54,135 - mmdeploy - INFO - tensorrt: 8.2.4.2   ops_is_avaliable : True
2022-05-24 01:54:54,136 - mmdeploy - INFO - ncnn: None  ops_is_avaliable : False
2022-05-24 01:54:54,137 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-05-24 01:54:54,138 - mmdeploy - INFO - openvino_is_avaliable: False
2022-05-24 01:54:54,138 - mmdeploy - INFO - 

2022-05-24 01:54:54,138 - mmdeploy - INFO - **********Codebase information**********
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmdet:      2.24.1
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmseg:      None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmcls:      None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmocr:      None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmedit:     None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmdet3d:    None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmpose:     None

Error traceback

If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

May 24 '22 03:05 ernestlwt

OK, let me try to make a minimal reproduction.

May 24 '22 15:05 tpoisonooo

I probably know what the problem is:

build_object_detection_model() needs get_classes_from_config()
The model from model = task_processor.init_backend_model(args.model) may be incorrect.

Can you give me $MODEL_CONFIG and $MODEL_PATH? I have tried ATSS and everything is fine.

May 25 '22 15:05 tpoisonooo

This is the configuration file I am using. I am retraining from the model in the mmdetection tutorial, faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth.

_base_ = [
    '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
]
# We also need to change the num_classes in head to match the dataset's annotation
# model = dict(
#     roi_head=dict(
#         bbox_head=dict(
#             type='Shared2FCBBoxHead',
#             num_classes=1,))
# )
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
image_size = (1333, 800)
train_pipeline = [
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize', # not this
        img_scale=image_size,
        ratio_range=(0.1, 2.0),
        multiscale_mode='range',
        keep_ratio=True),
    dict(
        type='RandomCrop', # not this
        crop_type='absolute_range',
        crop_size=image_size,
        recompute_bbox=True,
        allow_negative_crop=True),
    dict(type='FilterAnnotations', min_gt_bbox_wh=(1e-2, 1e-2)), # not this
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=image_size), 
    # dict(type='Pad', size=32),  
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

# Modify dataset related settings
dataset_type = 'COCODataset'
# data_root = 'data/coco_mini/'
data_root = '/root/workspace/mmdetection/data/coco_mini/'
# classes=  ('person',)
data = dict(
    train=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'train.json'),
    val=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'val.json'),
    test=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'val.json'))

# We can use the pre-trained Faster R-CNN model to obtain higher performance
load_from = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
evaluation = dict(interval=1, metric='bbox', save_best='bbox_mAP_50', classwise=True)
runner = dict(_delete_=True, type='EpochBasedRunner', max_epochs=12)

May 30 '22 01:05 ernestlwt

Thanks for your config file. I think I have reproduced the case.

As listed before, this snippet:

first attempts to find user input;
if that fails, gets the classes from the model configuration;
if that fails again, generates the sequence from len(CLASSES).
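
The three-step fallback described above can be sketched roughly as follows. This is only an illustration of the described order, not the actual mmdeploy code: the function name resolve_class_names, its parameters, and the use of a plain dict for the model configuration are all assumptions, and the last step is read here as "generate index-based placeholder names from the class count".

```python
from typing import Optional, Sequence, Tuple


def resolve_class_names(
    user_classes: Optional[Sequence[str]],
    model_cfg: dict,
    num_classes: int,
) -> Tuple[str, ...]:
    """Hypothetical helper mirroring the three-step fallback (not mmdeploy API)."""
    # 1. Prefer classes supplied explicitly by the user.
    if user_classes:
        return tuple(user_classes)
    # 2. Otherwise look for a `classes` entry in the model configuration.
    cfg_classes = model_cfg.get('classes')
    if cfg_classes:
        return tuple(cfg_classes)
    # 3. Otherwise generate placeholder names from the class count (assumed reading).
    return tuple(str(i) for i in range(num_classes))


# A top-level `classes` entry is picked up at step 2 even without user input.
print(resolve_class_names(None, {'classes': ('person',)}, 80))
```

If this reading is right, a top-level classes variable in the config would be found at step 2, which would be consistent with the filtering reported in this issue.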

In the mmdet custom dataset tutorial, the user needs to specify CLASSES.
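
For reference, passing the class tuple explicitly into every split, as the commented-out lines in the posted config hint at, might look like the fragment below. It is a sketch that reuses the data_root and file names from that config and only shows the data section; whether this alone makes both test scripts agree has not been verified here.

```python
# data_root and annotation file names are copied from the config posted above.
data_root = '/root/workspace/mmdetection/data/coco_mini/'
classes = ('person',)

data = dict(
    train=dict(
        img_prefix=data_root + 'images/',
        classes=classes,  # passed explicitly instead of being commented out
        ann_file=data_root + 'train.json'),
    val=dict(
        img_prefix=data_root + 'images/',
        classes=classes,
        ann_file=data_root + 'val.json'),
    test=dict(
        img_prefix=data_root + 'images/',
        classes=classes,
        ann_file=data_root + 'val.json'))
```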

So I think mmdeploy already meets the model trainer's original intent, which is to prioritize user input, and mmdeploy should not change this behavior.

Let me open an mmdet issue to discuss this inconsistency.

May 30 '22 13:05 tpoisonooo

Closing this issue for lack of reply. If you would like to discuss it further, please reopen it.

Aug 29 '22 09:08 tpoisonooo
