mmpretrain
Got poor performance when fine-tuning.
Checklist
- I have searched related issues but cannot get the expected help.
- I have read related documents and don't know what to do.
Describe the question you meet
I always get poor performance when fine-tuning models on my dataset. I have got a 0.93 F1-score using the ResNet-50 pipeline written by myself, while getting only 0.89 F1-score when using the ResNet-50 of mmclassification. My task is a simple binary classification, so I am really puzzled why I cannot get good performance using mmclassification.
Post related information
- The output of
pip list | grep "mmcv\|mmcls\|^torch"
mmcls 0.23.0
mmcv-full 1.5.2
torch 1.10.2
torchvision 0.11.3
- Your config file if you modified it or created a new one.
_base_ = [
    '../_base_/models/resnet50.py',
    # '../_base_/datasets/tongdao.py',
    '../_base_/schedules/imagenet_bs256.py',
    '../_base_/default_runtime.py'
]

# model settings
model = dict(
    type='ImageClassifier',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(3, ),
        style='pytorch'),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LinearClsHead',
        num_classes=2,
        in_channels=2048,
        loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
        topk=(1,),
    ))

# dataset settings
dataset_type = 'TongdaoDataset'
img_norm_cfg = dict(
    mean=[94, 99, 106], std=[51, 53, 53], to_rgb=True)
train_pipeline = [
    # dict(type='LoadImageFromFile'),
    # dict(type='RandomResizedCrop', size=224),
    dict(type='RandomFlip', flip_prob=0.5, direction='horizontal'),
    dict(type='RandomFlip', flip_prob=0.5, direction='vertical'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='ToTensor', keys=['gt_label']),
    dict(type='Collect', keys=['img', 'gt_label'])
]
test_pipeline = [
    # dict(type='LoadImageFromFile'),
    # dict(type='Resize', size=(256, -1)),
    # dict(type='CenterCrop', crop_size=224),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='Collect', keys=['img'])
]
data = dict(
    samples_per_gpu=128,
    workers_per_gpu=8,
    train=dict(
        type=dataset_type,
        data_prefix='data/imagenet/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_prefix='data/imagenet/val',
        ann_file='data/imagenet/meta/val.txt',
        pipeline=test_pipeline),
    test=dict(
        # replace `data/val` with `data/test` for standard test
        type=dataset_type,
        data_prefix='data/imagenet/val',
        ann_file='data/imagenet/meta/val.txt',
        pipeline=test_pipeline))
evaluation = dict(
    interval=1,
    metric=['precision', 'recall', 'f1_score'],
    metric_options=dict(average_mode='none'))
- Your train log file if you meet the problem during training. [here]
- Other code you modified in the mmcls folder. I have customized the dataloader and overridden the prepare_data function. Because my data needs complex preprocessing, I didn't use the official pipeline to read images; instead, I build the results dictionary myself:
results = {
    'img': img,
    'gt_label': lbl,
    'img_shape': img.shape,
    'ori_shape': img.shape,
}
where img is a numpy.ndarray of shape (224, 224, 3) and lbl is the label.
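Since the images bypass LoadImageFromFile, one thing that may be worth checking is the channel order reaching Normalize. The config sets to_rgb=True, which (as far as I understand the mmcv behaviour) reverses the channels before subtracting the mean. The sketch below is a NumPy-only emulation of that step, not the library code: normalize_like_mmcv and the synthetic image are hypothetical names made up for illustration. It shows what happens if the custom loader already returns RGB.

```python
import numpy as np


def normalize_like_mmcv(img, mean, std, to_rgb=True):
    """Assumed emulation: optional channel reversal, then (img - mean) / std."""
    img = img.astype(np.float32)
    if to_rgb:
        img = img[..., ::-1]  # reverse the last axis: BGR -> RGB (or RGB -> BGR!)
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    return (img - mean) / std


# Hypothetical image that is ALREADY in RGB order, filled with the dataset mean.
rgb = np.zeros((224, 224, 3), dtype=np.uint8)
rgb[..., 0], rgb[..., 1], rgb[..., 2] = 94, 99, 106

mean, std = [94, 99, 106], [51, 53, 53]
ok = normalize_like_mmcv(rgb, mean, std, to_rgb=False)       # channels match the mean
swapped = normalize_like_mmcv(rgb, mean, std, to_rgb=True)   # channels reversed first

print(np.abs(ok).max(), np.abs(swapped).max())
```

If the custom prepare_data already yields RGB arrays, the second call pairs the red channel with the blue mean and vice versa, so setting to_rgb=False would be the consistent choice under this assumption. If the arrays are BGR (for example, read with OpenCV), the existing setting is fine.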
Can you provide the difference between the two configuration transforms?
Sorry, I am not sure what "configuration transforms" refers to.
I have got a 0.93 F1-score using the Resnet-50 pipeline written by myself, while getting only 0.89 F1-score when using Resnet-50 of mmclassification.
What is the difference between those?
The default config in mmcls is for the ImageNet-1k dataset.
You can use ImageNet-pretrained models. Check this:
https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet50_8xb8_cub.py#L7-L13
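For readers who cannot open the link: the referenced lines load pretrained weights into the backbone through init_cfg. A minimal sketch of how that might look for the binary task in this issue follows. The checkpoint path is a hypothetical placeholder, not a real file, and the rest of the model dict is copied from the config posted above.

```python
# Hypothetical path: substitute a real ImageNet-pretrained ResNet-50 checkpoint.
pretrained = 'checkpoints/resnet50_imagenet.pth'

model = dict(
    type='ImageClassifier',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(3, ),
        style='pytorch',
        # Load only the backbone weights; the 2-class head stays randomly initialized.
        init_cfg=dict(
            type='Pretrained', checkpoint=pretrained, prefix='backbone')),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LinearClsHead',
        num_classes=2,
        in_channels=2048,
        loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
        topk=(1, )))
```

The prefix='backbone' entry selects just the backbone keys from a full classifier checkpoint, so the 1000-class ImageNet head is not loaded. The imagenet_bs256 schedule is meant for training from scratch, so a smaller learning rate is probably also worth trying when fine-tuning.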
This issue will be closed as it is inactive, feel free to re-open it if necessary.