mmpretrain
[Bug] When testing a mobilenet_v3-small model trained on a custom RGB-image dataset using mmpretrain, the output image channel order is BGR
Branch
main branch (mmpretrain version)
Describe the bug
I prepared a custom classification dataset of RGB images and trained a mobilenetv3-small image classification model with mmpretrain. Testing with the command below showed that every output had the correct category label, but the output result images were in BGR channel order. Subsequent tests with BGR input images produced result images in RGB channel order. So I suspect there is a problem in the model's test code.
mim test mmpretrain Badminton-mobilenet-v3-small_bs32.py --checkpoint best_accuracy_top1_epoch_13.pth --out result.pkl
My model training and testing config is as follows
bgr_mean = [
103.53,
116.28,
123.675,
]
bgr_std = [
57.375,
57.12,
58.395,
]
data_preprocessor = dict(
mean=[
123.675,
116.28,
103.53,
],
num_classes=2,
std=[
58.395,
57.12,
57.375,
],
to_rgb=True)
data_root = 'D:\\CodeSpace\\AI\\Badminton\\database\\Video_View_Classification_split'
dataset_type = 'ImageNet'
default_hooks = dict(
checkpoint=dict(
interval=1, max_keep_ckpts=5, save_best='auto', type='CheckpointHook'),
logger=dict(interval=30, type='LoggerHook'),
param_scheduler=dict(type='ParamSchedulerHook'),
sampler_seed=dict(type='DistSamplerSeedHook'),
timer=dict(type='IterTimerHook'),
visualization=dict(
enable=True,
interval=300,
out_dir=None,
show=True,
type='VisualizationHook',
wait_time=5.0))
default_scope = 'mmpretrain'
env_cfg = dict(
cudnn_benchmark=False,
dist_cfg=dict(backend='nccl'),
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
launcher = 'none'
load_from = '.\\work_dirs\\Badminton-mobilenet-v3-small_bs32\\best_accuracy_top1_epoch_13.pth'
log_level = 'INFO'
model = dict(
backbone=dict(
arch='small',
frozen_stages=2,
init_cfg=dict(
checkpoint=
'https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/mobilenet-v3-small_8xb128_in1k_20221114-bd1bfcde.pth',
prefix='backbone',
type='Pretrained'),
type='MobileNetV3'),
head=dict(
act_cfg=dict(type='HSwish'),
dropout_rate=0.2,
in_channels=576,
init_cfg=dict(
bias=0.0, layer='Linear', mean=0.0, std=0.01, type='Normal'),
loss=dict(loss_weight=1.0, type='CrossEntropyLoss'),
mid_channels=[
1024,
],
num_classes=2,
topk=(
1,
5,
),
type='StackedLinearClsHead'),
neck=dict(type='GlobalAveragePooling'),
type='ImageClassifier')
optim_wrapper = dict(
optimizer=dict(lr=0.01, momentum=0.9, type='SGD', weight_decay=0.0001))
param_scheduler = dict(
by_epoch=True, gamma=0.1, milestones=[
15,
], type='MultiStepLR')
randomness = dict(deterministic=False, seed=None)
resume = False
test_cfg = dict()
test_dataloader = dict(
batch_size=128,
collate_fn=dict(type='default_collate'),
dataset=dict(
ann_file='',
data_prefix='test',
data_root=
'D:\\CodeSpace\\AI\\Badminton\\database\\Video_View_Classification_split',
pipeline=[
dict(type='LoadImageFromFile'),
dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
dict(crop_size=224, type='CenterCrop'),
dict(type='PackInputs'),
],
split='val',
type='CustomDataset',
with_label=True),
num_workers=5,
persistent_workers=True,
pin_memory=True,
sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(topk=1, type='Accuracy')
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
dict(crop_size=224, type='CenterCrop'),
dict(type='PackInputs'),
]
train_cfg = dict(by_epoch=True, max_epochs=15, val_interval=1)
train_dataloader = dict(
batch_size=128,
collate_fn=dict(type='default_collate'),
dataset=dict(
ann_file='',
data_prefix='train',
data_root=
'D:\\CodeSpace\\AI\\Badminton\\database\\Video_View_Classification_split',
pipeline=[
dict(type='LoadImageFromFile'),
dict(backend='pillow', scale=224, type='RandomResizedCrop'),
dict(direction='horizontal', prob=0.5, type='RandomFlip'),
dict(
hparams=dict(pad_val=[
104,
116,
124,
]),
policies='imagenet',
type='AutoAugment'),
dict(
erase_prob=0.2,
fill_color=[
103.53,
116.28,
123.675,
],
fill_std=[
57.375,
57.12,
58.395,
],
max_area_ratio=0.3333333333333333,
min_area_ratio=0.02,
mode='rand',
type='RandomErasing'),
dict(type='PackInputs'),
],
split='train',
type='CustomDataset',
with_label=True),
num_workers=5,
persistent_workers=True,
pin_memory=True,
sampler=dict(shuffle=True, type='DefaultSampler'))
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(backend='pillow', scale=224, type='RandomResizedCrop'),
dict(direction='horizontal', prob=0.5, type='RandomFlip'),
dict(
hparams=dict(pad_val=[
104,
116,
124,
]),
policies='imagenet',
type='AutoAugment'),
dict(
erase_prob=0.2,
fill_color=[
103.53,
116.28,
123.675,
],
fill_std=[
57.375,
57.12,
58.395,
],
max_area_ratio=0.3333333333333333,
min_area_ratio=0.02,
mode='rand',
type='RandomErasing'),
dict(type='PackInputs'),
]
val_cfg = dict()
val_dataloader = dict(
batch_size=128,
collate_fn=dict(type='default_collate'),
dataset=dict(
ann_file='',
data_prefix='val',
data_root=
'D:\\CodeSpace\\AI\\Badminton\\database\\Video_View_Classification_split',
pipeline=[
dict(type='LoadImageFromFile'),
dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
dict(crop_size=224, type='CenterCrop'),
dict(type='PackInputs'),
],
split='val',
type='CustomDataset',
with_label=True),
num_workers=5,
persistent_workers=True,
pin_memory=True,
sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(topk=1, type='Accuracy')
vis_backends = [
dict(type='LocalVisBackend'),
]
visualizer = dict(
type='UniversalVisualizer', vis_backends=[
dict(type='LocalVisBackend'),
])
work_dir = './work_dirs\\Badminton-mobilenet-v3-small_bs32'
The system information is as follows:
System environment:
sys.platform: win32
Python: 3.8.18 (default, Sep 11 2023, 13:39:12) [MSC v.1916 64 bit (AMD64)]
CUDA available: True
numpy_random_seed: 2062937103
GPU 0: NVIDIA GeForce GTX 1060 with Max-Q Design
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
NVCC: Cuda compilation tools, release 11.8, V11.8.89
MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.37.32822 for x64
GCC: n/a
PyTorch: 2.0.1
PyTorch compiling details: PyTorch built with:
- C++ Version: 199711
- MSVC 193431937
- Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
- OpenMP 2019
- LAPACK is enabled (usually provided by MKL)
- CPU capability usage: AVX2
- CUDA Runtime 11.8
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_37,code=compute_37
- CuDNN 8.7
- Magma 2.5.4
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=C:/cb/pytorch_1000000000000/work/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj /FS -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=OFF, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.15.2
OpenCV: 4.8.0
MMEngine: 0.8.4
------------------------------------------------------------
An example of an output image with the wrong channel order is as follows:
Environment
{'sys.platform': 'win32',
'Python': '3.8.18 (default, Sep 11 2023, 13:39:12) [MSC v.1916 64 bit '
'(AMD64)]',
'CUDA available': True,
'numpy_random_seed': 2147483648,
'GPU 0': 'NVIDIA GeForce GTX 1060 with Max-Q Design',
'CUDA_HOME': 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.8',
'NVCC': 'Cuda compilation tools, release 11.8, V11.8.89',
'MSVC': 'Microsoft (R) C/C++ Optimizing Compiler Version 19.37.32822 for x64',
'GCC': 'n/a',
'PyTorch': '2.0.1',
'TorchVision': '0.15.2',
'OpenCV': '4.8.0',
'MMEngine': '0.8.4',
'MMCV': '2.0.1',
'MMPreTrain': '1.0.2+'}
Other information
- mmpretrain has not been modified
- Probably OpenCV reads the image as BGR, but it is not converted to RGB before the output result image is produced
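
To illustrate the suspected cause, here is a minimal NumPy sketch of what a missing BGR/RGB conversion does to an image. It does not use any mmpretrain code; `swap_channel_order` is a hypothetical helper written only for this illustration. Because reversing the channel axis is its own inverse, the same missing conversion would explain both observations: RGB inputs come out looking BGR, and BGR inputs come out looking RGB.

```python
import numpy as np


def swap_channel_order(image: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an H x W x 3 image (BGR <-> RGB).

    Hypothetical helper for illustration only; not part of mmpretrain.
    """
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an H x W x 3 image")
    # Reversing the last axis swaps the first and third channels.
    return image[:, :, ::-1].copy()


# A single pure-red pixel stored in RGB order.
rgb_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)

# After one swap the values are [0, 0, 255], which a viewer that
# assumes RGB order would display as blue instead of red.
swapped = swap_channel_order(rgb_pixel)

# Swapping a second time restores the original pixel.
restored = swap_channel_order(swapped)
```

If the displayed result looks correct after applying such a swap to the output image, that would support the idea that a conversion step is missing somewhere between loading and visualization.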
mark
I have the same problem.