mmpretrain icon indicating copy to clipboard operation
mmpretrain copied to clipboard

[Bug] When testing a mobilenet_v3-small model trained on a custom RGB-image dataset using mmpretrain, the output image channel order is BGR

Open Free-Geter opened this issue 1 year ago • 2 comments

Branch

main branch (mmpretrain version)

Describe the bug

After customizing a classification task dataset of RGB images and training a mobilenetv3-small image classification model using mmpretrain, testing using the following instructions revealed that all outputs had the correct category labeling information, but the output resultant image was the bgr channel. Subsequent tests using images of the bgr channel showed that the output resultant images were the rgb channel order. So I think maybe there is some problem with the test code of the model

mim test mmpretrain Badminton-mobilenet-v3-small_bs32.py --checkpoint best_accuracy_top1_epoch_13.pth --out result.pkl

My model training and testing config is as follows

bgr_mean = [
    103.53,
    116.28,
    123.675,
]
bgr_std = [
    57.375,
    57.12,
    58.395,
]
data_preprocessor = dict(
    mean=[
        123.675,
        116.28,
        103.53,
    ],
    num_classes=2,
    std=[
        58.395,
        57.12,
        57.375,
    ],
    to_rgb=True)
data_root = 'D:\\CodeSpace\\AI\\Badminton\\database\\Video_View_Classification_split'
dataset_type = 'ImageNet'
default_hooks = dict(
    checkpoint=dict(
        interval=1, max_keep_ckpts=5, save_best='auto', type='CheckpointHook'),
    logger=dict(interval=30, type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    timer=dict(type='IterTimerHook'),
    visualization=dict(
        enable=True,
        interval=300,
        out_dir=None,
        show=True,
        type='VisualizationHook',
        wait_time=5.0))
default_scope = 'mmpretrain'
env_cfg = dict(
    cudnn_benchmark=False,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
launcher = 'none'
load_from = '.\\work_dirs\\Badminton-mobilenet-v3-small_bs32\\best_accuracy_top1_epoch_13.pth'
log_level = 'INFO'
model = dict(
    backbone=dict(
        arch='small',
        frozen_stages=2,
        init_cfg=dict(
            checkpoint=
            'https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/mobilenet-v3-small_8xb128_in1k_20221114-bd1bfcde.pth',
            prefix='backbone',
            type='Pretrained'),
        type='MobileNetV3'),
    head=dict(
        act_cfg=dict(type='HSwish'),
        dropout_rate=0.2,
        in_channels=576,
        init_cfg=dict(
            bias=0.0, layer='Linear', mean=0.0, std=0.01, type='Normal'),
        loss=dict(loss_weight=1.0, type='CrossEntropyLoss'),
        mid_channels=[
            1024,
        ],
        num_classes=2,
        topk=(
            1,
            5,
        ),
        type='StackedLinearClsHead'),
    neck=dict(type='GlobalAveragePooling'),
    type='ImageClassifier')
optim_wrapper = dict(
    optimizer=dict(lr=0.01, momentum=0.9, type='SGD', weight_decay=0.0001))
param_scheduler = dict(
    by_epoch=True, gamma=0.1, milestones=[
        15,
    ], type='MultiStepLR')
randomness = dict(deterministic=False, seed=None)
resume = False
test_cfg = dict()
test_dataloader = dict(
    batch_size=128,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        ann_file='',
        data_prefix='test',
        data_root=
        'D:\\CodeSpace\\AI\\Badminton\\database\\Video_View_Classification_split',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
            dict(crop_size=224, type='CenterCrop'),
            dict(type='PackInputs'),
        ],
        split='val',
        type='CustomDataset',
        with_label=True),
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(topk=1, type='Accuracy')
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
    dict(crop_size=224, type='CenterCrop'),
    dict(type='PackInputs'),
]
train_cfg = dict(by_epoch=True, max_epochs=15, val_interval=1)
train_dataloader = dict(
    batch_size=128,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        ann_file='',
        data_prefix='train',
        data_root=
        'D:\\CodeSpace\\AI\\Badminton\\database\\Video_View_Classification_split',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(backend='pillow', scale=224, type='RandomResizedCrop'),
            dict(direction='horizontal', prob=0.5, type='RandomFlip'),
            dict(
                hparams=dict(pad_val=[
                    104,
                    116,
                    124,
                ]),
                policies='imagenet',
                type='AutoAugment'),
            dict(
                erase_prob=0.2,
                fill_color=[
                    103.53,
                    116.28,
                    123.675,
                ],
                fill_std=[
                    57.375,
                    57.12,
                    58.395,
                ],
                max_area_ratio=0.3333333333333333,
                min_area_ratio=0.02,
                mode='rand',
                type='RandomErasing'),
            dict(type='PackInputs'),
        ],
        split='train',
        type='CustomDataset',
        with_label=True),
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=True, type='DefaultSampler'))
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(backend='pillow', scale=224, type='RandomResizedCrop'),
    dict(direction='horizontal', prob=0.5, type='RandomFlip'),
    dict(
        hparams=dict(pad_val=[
            104,
            116,
            124,
        ]),
        policies='imagenet',
        type='AutoAugment'),
    dict(
        erase_prob=0.2,
        fill_color=[
            103.53,
            116.28,
            123.675,
        ],
        fill_std=[
            57.375,
            57.12,
            58.395,
        ],
        max_area_ratio=0.3333333333333333,
        min_area_ratio=0.02,
        mode='rand',
        type='RandomErasing'),
    dict(type='PackInputs'),
]
val_cfg = dict()
val_dataloader = dict(
    batch_size=128,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        ann_file='',
        data_prefix='val',
        data_root=
        'D:\\CodeSpace\\AI\\Badminton\\database\\Video_View_Classification_split',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
            dict(crop_size=224, type='CenterCrop'),
            dict(type='PackInputs'),
        ],
        split='val',
        type='CustomDataset',
        with_label=True),
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(topk=1, type='Accuracy')
vis_backends = [
    dict(type='LocalVisBackend'),
]
visualizer = dict(
    type='UniversalVisualizer', vis_backends=[
        dict(type='LocalVisBackend'),
    ])
work_dir = './work_dirs\\Badminton-mobilenet-v3-small_bs32'

The system information is as follows:

System environment:
    sys.platform: win32
    Python: 3.8.18 (default, Sep 11 2023, 13:39:12) [MSC v.1916 64 bit (AMD64)]
    CUDA available: True
    numpy_random_seed: 2062937103
    GPU 0: NVIDIA GeForce GTX 1060 with Max-Q Design
    CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
    NVCC: Cuda compilation tools, release 11.8, V11.8.89
    MSVC: 用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.37.32822 版
    GCC: n/a
    PyTorch: 2.0.1
    PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 193431937
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.8
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.7
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=C:/cb/pytorch_1000000000000/work/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj /FS -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=OFF, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, 

    TorchVision: 0.15.2
    OpenCV: 4.8.0
    MMEngine: 0.8.4
------------------------------------------------------------

An example of an output error image is as follows CHEN_Long_CHOU_Tien_Chen_World_Tour_Finals_Group_Stage_frame_20825 jpg_0

Environment

{'sys.platform': 'win32',
 'Python': '3.8.18 (default, Sep 11 2023, 13:39:12) [MSC v.1916 64 bit '
           '(AMD64)]',
 'CUDA available': True,
 'numpy_random_seed': 2147483648,
 'GPU 0': 'NVIDIA GeForce GTX 1060 with Max-Q Design',
 'CUDA_HOME': 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.8',
 'NVCC': 'Cuda compilation tools, release 11.8, V11.8.89',
 'MSVC': '用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.37.32822 版',
 'GCC': 'n/a',
 'PyTorch': '2.0.1',
 'TorchVision': '0.15.2',
 'OpenCV': '4.8.0',
 'MMEngine': '0.8.4',
 'MMCV': '2.0.1',
 'MMPreTrain': '1.0.2+'}

Other information

  1. mmpretrain has not been modified
  2. probably opencv reads the image as bgr but doesn't convert to rgb before testing the output

Free-Geter avatar Sep 22 '23 12:09 Free-Geter

mark

lmx1989219 avatar Dec 08 '23 09:12 lmx1989219

I meet the same question.

3maoyap avatar Mar 13 '24 15:03 3maoyap