mmpose 1.x does NOT work for training my own dataset
I followed the repo and trained the demo locust dataset. It works, and here is the training log:
01/07 10:54:26 - mmengine - INFO - Epoch(train) [1][50/79] lr: 4.954910e-05 eta: 0:02:59 time: 0.241931 data_time: 0.098693 memory: 1397 loss: 0.006913 loss_kpt: 0.006913 acc_pose: 0.178571
01/07 10:54:29 - mmengine - INFO - Exp name: td-hm_res101_8xb64-210e_locust-160x160_20230107_105408
01/07 10:54:29 - mmengine - INFO - Saving checkpoint at 1 epochs
01/07 10:54:36 - mmengine - INFO - Epoch(train) [2][50/79] lr: 1.286283e-04 eta: 0:01:26 time: 0.096829 data_time: 0.001756 memory: 1396 loss: 0.004748 loss_kpt: 0.004748 acc_pose: 0.407143
01/07 10:54:39 - mmengine - INFO - Exp name: td-hm_res101_8xb64-210e_locust-160x160_20230107_105408
01/07 10:54:39 - mmengine - INFO - Saving checkpoint at 2 epochs
Then I wanted to use my own dataset, which has been converted to COCO format. Following the locust dataset setup in the same way, I get the training log below:
01/07 11:33:50 - mmengine - INFO - Checkpoints will be saved to C:\Work\Proj\PyProj\mmpose-1.x\mmpose-1.x\tools\work_dirs\td-hm_res101_8xb64-210e_chicken-160x160.
01/07 11:34:07 - mmengine - INFO - Epoch(train) [1][50/204] lr: 4.954910e-05 eta: 0:11:45 time: 0.354480 data_time: 0.154573 memory: 2239 loss: 0.000000 loss_kpt: 0.000000 acc_pose: 0.000000
01/07 11:34:18 - mmengine - INFO - Epoch(train) [1][100/204] lr: 9.959920e-05 eta: 0:09:13 time: 0.216220 data_time: 0.065922 memory: 2239 loss: 0.000003 loss_kpt: 0.000003 acc_pose: 0.000000
01/07 11:34:29 - mmengine - INFO - Epoch(train) [1][150/204] lr: 1.496493e-04 eta: 0:08:14 time: 0.213670 data_time: 0.064010 memory: 2239 loss: 0.000000 loss_kpt: 0.000000 acc_pose: 0.000000
01/07 11:34:39 - mmengine - INFO - Epoch(train) [1][200/204] lr: 1.996994e-04 eta: 0:07:36 time: 0.207582 data_time: 0.058121 memory: 2239 loss: 0.000000 loss_kpt: 0.000000 acc_pose: 0.000000
01/07 11:34:40 - mmengine - INFO - Exp name: td-hm_res101_8xb64-210e_chicken-160x160_20230107_113344
01/07 11:34:40 - mmengine - INFO - Saving checkpoint at 1 epochs
01/07 11:34:54 - mmengine - INFO - Epoch(train) [2][50/204] lr: 2.537535e-04 eta: 0:06:57 time: 0.194315 data_time: 0.037169 memory: 2239 loss: 0.000000 loss_kpt: 0.000000 acc_pose: 0.000000
01/07 11:35:04 - mmengine - INFO - Epoch(train) [2][100/204] lr: 3.038036e-04 eta: 0:06:32 time: 0.188513 data_time: 0.035870 memory: 2239 loss: 0.000000 loss_kpt: 0.000000 acc_pose: 0.000000
01/07 11:35:13 - mmengine - INFO - Epoch(train) [2][150/204] lr: 3.538537e-04 eta: 0:06:13 time: 0.193783 data_time: 0.044105 memory: 2239 loss: 0.000000 loss_kpt: 0.000000 acc_pose: 0.000000
The loss, loss_kpt and acc_pose are always ZERO.
Here are some details:
(1) config file:
_base_ = ['../../../_base_/default_runtime.py' ]
# runtime
train_cfg = dict(max_epochs=10, val_interval=10) #210
# optimizer
optim_wrapper = dict(optimizer=dict(
type='Adam',
lr=5e-4,
))
# learning policy
param_scheduler = [
dict(
type='LinearLR', begin=0, end=500, start_factor=0.001,
by_epoch=False), # warm-up
dict(
type='MultiStepLR',
begin=0,
end=210,
milestones=[170, 200],
gamma=0.1,
by_epoch=True)
]
# automatically scaling LR based on the actual training batch size
auto_scale_lr = dict(base_batch_size=512)
# hooks
default_hooks = dict(checkpoint=dict(save_best='coco/AP', rule='greater'))
# codec settings
codec = dict(
type='MSRAHeatmap', input_size=(160, 160), heatmap_size=(40, 40), sigma=2)
# model settings
model = dict(
type='TopdownPoseEstimator',
data_preprocessor=dict(
type='PoseDataPreprocessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True),
backbone=dict(
type='ResNet',
depth=101,
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet101'),
),
head=dict(
type='HeatmapHead',
in_channels=2048,
out_channels=10, #change 35 to 10
loss=dict(type='KeypointMSELoss', use_target_weight=True),
decoder=codec),
test_cfg=dict(
flip_test=True,
flip_mode='heatmap',
shift_heatmap=True,
))
# base dataset settings
dataset_type = 'ChickenDataset'
data_mode = 'topdown'
data_root = '../data/chicken/'
# pipelines
train_pipeline = [
dict(type='LoadImage', file_client_args={{_base_.file_client_args}}),
dict(type='GetBBoxCenterScale', padding=0.8),
dict(type='RandomFlip', direction='horizontal'),
dict(
type='RandomBBoxTransform',
shift_factor=0.25,
rotate_factor=180,
scale_factor=(0.7, 1.3)),
dict(type='TopdownAffine', input_size=codec['input_size']),
dict(type='GenerateTarget', target_type='heatmap', encoder=codec),
dict(type='PackPoseInputs')
]
val_pipeline = [
dict(type='LoadImage', file_client_args={{_base_.file_client_args}}),
dict(type='GetBBoxCenterScale', padding=0.8),
dict(type='TopdownAffine', input_size=codec['input_size']),
dict(type='PackPoseInputs')
]
# data loaders
train_dataloader = dict(
batch_size=8, #64
num_workers=2,
persistent_workers=True,
sampler=dict(type='DefaultSampler', shuffle=True),
dataset=dict(
type=dataset_type,
data_root=data_root,
data_mode=data_mode,
ann_file='annotations/train.json',
data_prefix=dict(img='images/'),
pipeline=train_pipeline,
))
val_dataloader = dict(
batch_size=32,
num_workers=2,
persistent_workers=True,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False, round_up=False),
dataset=dict(
type=dataset_type,
data_root=data_root,
data_mode=data_mode,
ann_file='annotations/val.json',
data_prefix=dict(img='images/'),
test_mode=True,
pipeline=val_pipeline,
))
test_dataloader = val_dataloader
# evaluators
val_evaluator = [
dict(type='PCKAccuracy', thr=0.2),
dict(type='AUC'),
dict(type='EPE'),
]
test_evaluator = val_evaluator
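To sanity-check this config, the training dataset can be built from it directly and the generated targets inspected. A minimal debugging sketch (the config path is an assumption, and the attribute names follow the mmpose 1.x data-sample layout, so they may differ slightly between versions):
# Build the train dataset from the config above and check whether the
# generated heatmap targets / keypoint weights are all zero. With
# use_target_weight=True, all-zero weights would explain a zero loss.
from mmengine.config import Config
from mmengine.registry import init_default_scope
from mmpose.registry import DATASETS

cfg = Config.fromfile('td-hm_res101_8xb64-210e_chicken-160x160.py')  # assumed path
init_default_scope('mmpose')

dataset = DATASETS.build(cfg.train_dataloader.dataset)
for i in range(3):
    data_sample = dataset[i]['data_samples']
    print('heatmap max:     ', float(data_sample.gt_fields.heatmaps.max()))
    print('keypoint_weights:', data_sample.gt_instance_labels.keypoint_weights)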
(2) annotation file (excerpt):
"images": [
{
"height": 1920,
"width": 2560,
"id": 1,
"dir_name": "train",
"file_name": "ch02_20210212124211_001046.jpg"
},
{
"height": 1920,
"width": 2560,
"id": 2,
"dir_name": "train",
"file_name": "ch02_20210213143154_000524.jpg"
...
"image_id": 922,
"bbox": [
762,
631,
1415,
1288
],
"category_id": 1,
"id": 1545
},
{
"segmentation": [
0
],
"num_keypoints": 9,
"area": 1807656,
"iscrowd": 0,
"keypoints": [
715,
1125,
2,
0,
0,
0,
786,
1584,
2,
699,
1568,
2,
799,
1746,
2,
643,
1703,
2,
392,
623,
2,
299,
637,
2,
354,
595,
2,
343,
697,
2
],
...
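Since the training targets are generated from these annotations, a small sanity check over the converted file can rule out common format issues. A sketch (the annotation path and keypoint count are assumptions based on the config above):
# Check that each annotation has 10 keypoint triplets, that bboxes look like
# COCO [x, y, w, h] rather than [x1, y1, x2, y2], and that visible keypoints
# lie inside the image.
import json

NUM_KEYPOINTS = 10
with open('../data/chicken/annotations/train.json') as f:  # assumed path
    coco = json.load(f)

images = {img['id']: img for img in coco['images']}
for ann in coco['annotations']:
    kpts = ann['keypoints']
    assert len(kpts) == 3 * NUM_KEYPOINTS, f"ann {ann['id']}: {len(kpts)} values"
    img = images[ann['image_id']]
    x, y, w, h = ann['bbox']
    if x + w > img['width'] or y + h > img['height']:
        print(f"ann {ann['id']}: bbox may be [x1, y1, x2, y2] instead of [x, y, w, h]")
    for i in range(NUM_KEYPOINTS):
        kx, ky, v = kpts[3 * i:3 * i + 3]
        if v > 0 and not (0 <= kx < img['width'] and 0 <= ky < img['height']):
            print(f"ann {ann['id']}: keypoint {i} is outside the image")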
(3) dataset info (keypoint meta) file:
dataset_info = dict(
dataset_name='chicken',
paper_info=dict(
author='yuan ',
title='chicken',
container='',
year='2022',
homepage='',
),
keypoint_info={
0:
dict(name='body_center', id=0, color=[51, 153, 255], type='', swap=''),
1:
dict(
name='body_tail',
id=1,
color=[51, 153, 255],
type='',
swap=''),
2:
dict(
name='body_knee_left',
id=2,
color=[51, 153, 255],
type='',
swap=''),
3:
dict(
name='body_knee_right',
id=3,
color=[51, 153, 255],
type='',
swap=''),
4:
dict(
name='body_heel_left',
id=4,
color=[51, 153, 255],
type='',
swap=''),
5:
dict(
name='body_heel_right',
id=5,
color=[0, 255, 0],
type='',
swap=''),
6:
dict(
name='eye_left',
id=6,
color=[255, 128, 0],
type='',
swap=''),
7:
dict(
name='eye_right',
id=7,
color=[0, 255, 0],
type='',
swap=''),
8:
dict(
name='comb',
id=8,
color=[255, 128, 0],
type='',
swap=''),
9:
dict(
name='beak',
id=9,
color=[0, 255, 0],
type='',
swap=''),
},
skeleton_info={
0:
dict(link=('body_center', 'body_tail'), id=0, color=[0, 255, 0]),
1:
dict(link=('body_center', 'body_knee_left'), id=1, color=[0, 255, 0]),
2:
dict(link=('body_center', 'body_knee_right'), id=2, color=[255, 128, 0]),
3:
dict(link=('body_center', 'eye_left'), id=3, color=[255, 128, 0]),
4:
dict(link=('body_knee_left', 'body_heel_left'), id=4, color=[51, 153, 255]),
5:
dict(link=('body_knee_right', 'body_heel_right'), id=5, color=[0, 255, 0]),
6:
dict(
link=('eye_left', 'comb'), id=6, color=[255, 128, 0]),
7:
dict(link=('eye_left', 'beak'), id=7, color=[0, 255, 0]),
},
joint_weights=[
1., 1., 1., 1., 1., 1., 1., 1., 1.2, 1.2
],
sigmas=[
0.026, 0.025, 0.025, 0.035, 0.035, 0.029, 0.029, 0.072, 0.079, 0.079
]
)
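This meta file only takes effect once it is wired into a registered dataset class. A minimal sketch of how ChickenDataset might be defined (the from_file path is an assumption and should point at the file above):
# Custom COCO-style dataset class for mmpose 1.x. It must also be imported
# (e.g. in mmpose/datasets/datasets/__init__.py) so that registration runs
# when the config is parsed.
from mmpose.datasets.datasets.base import BaseCocoStyleDataset
from mmpose.registry import DATASETS


@DATASETS.register_module()
class ChickenDataset(BaseCocoStyleDataset):
    """COCO-style chicken keypoint dataset."""

    METAINFO: dict = dict(from_file='configs/_base_/datasets/chicken.py')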
Maybe (1) my images are too big (2560 * 1920)? Or (2) input_size=(160, 160) in the config file is not suitable?
Any help will be appreciated, thanks.
mmpose-master works for training my own dataset. So, is something wrong with mmpose-1.x?
Hi, sorry for the late reply. There have been updates to mmpose-1.x as well as to upstream codebases like mmcv and mmengine over the past months, so would you like to install the latest mmpose-1.x and try again? To do that, you would need to upgrade mmcv/mmengine to the latest versions and build mmpose from the dev-1.x branch.
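If it helps, a quick way to confirm which versions are installed before retrying:
# Print the installed versions of the relevant packages.
import mmcv
import mmengine
import mmpose

print('mmcv:    ', mmcv.__version__)
print('mmengine:', mmengine.__version__)
print('mmpose:  ', mmpose.__version__)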
We would appreciate it a lot if you could send us more feedback on mmpose-1.x.