How Do I Use a COCO-Annotated Custom Keypoint Dataset for Custom Objects in MMPose
Hi,
I have annotated a custom dataset in COCO format with keypoints. It's a square object where the keypoints are the corners. My simple question: how do I use this annotated data directly with MMPose in the simplest way, given that it is already in COCO format?
I have been following the tutorial notebook; here are the things I have tried so far, after running into many errors:
- Assuming that my dataset would be directly compatible, I put it in a structure similar to the one the tutorial uses for its COCOTinyDataset. I did not run the cell that registers that dataset; instead I ran the configuration cell as shown below, setting the dataset type to `TopDownCocoDataset`, which is already registered.
from mmcv import Config

cfg = Config.fromfile(
    './configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py')

# set basic configs
cfg.data_root = 'data/myObjs'
cfg.work_dir = 'work_dirs/hrnet_w32_coco_tiny_256x192'
cfg.gpu_ids = range(1)
cfg.seed = 0

# set log interval
cfg.log_config.interval = 1

# set evaluation configs
cfg.evaluation.interval = 10
cfg.evaluation.metric = 'PCK'
cfg.evaluation.save_best = 'PCK'

# set learning rate policy (note: assign to cfg.lr_config, not a bare
# local variable, so the loaded config is actually updated)
cfg.lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=10,
    warmup_ratio=0.001,
    step=[17, 35])
cfg.total_epochs = 40

# set batch size
cfg.data.samples_per_gpu = 16
cfg.data.val_dataloader = dict(samples_per_gpu=16)
cfg.data.test_dataloader = dict(samples_per_gpu=16)

# set dataset configs
cfg.data.train.type = 'TopDownCocoDataset'
cfg.data.train.ann_file = f'{cfg.data_root}/test.json'
cfg.data.train.img_prefix = f'{cfg.data_root}/images/'

cfg.data.val.type = 'TopDownCocoDataset'
cfg.data.val.ann_file = f'{cfg.data_root}/test.json'
cfg.data.val.img_prefix = f'{cfg.data_root}/images/'

cfg.data.test.type = 'TopDownCocoDataset'
cfg.data.test.ann_file = f'{cfg.data_root}/test.json'
cfg.data.test.img_prefix = f'{cfg.data_root}/images/'

print(cfg.pretty_text)
- The printout from the cell above still contained a lot of human-pose-specific information, which was not what I wanted, and since training did not start I suspected it was at fault. I traced it to `mmpose/configs/_base_/datasets/coco.py` and edited all the information there to match my scenario:
dataset_info = dict(
    dataset_name='coco',
    paper_info=dict(
        author='Lin, Tsung-Yi and Maire, Michael and '
        'Belongie, Serge and Hays, James and '
        'Perona, Pietro and Ramanan, Deva and '
        r'Doll{\'a}r, Piotr and Zitnick, C Lawrence',
        title='Microsoft coco: Common objects in context',
        container='European conference on computer vision',
        year='2014',
        homepage='http://cocodataset.org/',
    ),
    keypoint_info={
        0:
        dict(
            name='top_left',
            id=0,
            color=[51, 153, 255],
            type='upper',
            swap='top_right'),
        1:
        dict(
            name='top_right',
            id=1,
            color=[51, 153, 255],
            type='upper',
            swap='top_left'),
        2:
        dict(
            name='bottom_left',
            id=2,
            color=[51, 153, 255],
            type='upper',
            swap='bottom_right'),
        3:
        dict(
            name='bottom_right',
            id=3,
            color=[51, 153, 255],
            type='upper',
            swap='bottom_left')
    },
    skeleton_info={
        0:
        dict(link=('top_left', 'top_right'), id=0, color=[0, 255, 0]),
        1:
        dict(link=('top_left', 'bottom_left'), id=1, color=[0, 255, 0]),
        2:
        dict(link=('top_right', 'bottom_right'), id=2, color=[255, 128, 0]),
        3:
        dict(link=('bottom_left', 'bottom_right'), id=3, color=[255, 128, 0])
    },
    joint_weights=[1., 1., 1., 1.],
    sigmas=[0.026, 0.025, 0.025, 0.035])
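As a sanity check that the annotation file and this metadata agree, one can inspect the category with pycocotools (a minimal sketch; the path comes from the config cell above, and the keypoint names are assumed to match the edited dataset_info):

from pycocotools.coco import COCO

coco = COCO('data/myObjs/test.json')  # annotation file used in the config cell
cat = coco.loadCats(coco.getCatIds())[0]
print(cat['keypoints'])  # expect ['top_left', 'top_right', 'bottom_left', 'bottom_right']
print(cat['skeleton'])   # COCO skeleton links (1-indexed keypoint ids)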
- With that, I finally see some output when I run the training cell, but it errors again, this time about the file `data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json` not being present. I remember seeing it mentioned in the Markdown doc on using your own dataset, but I don't have such a file for my dataset. I downloaded it anyway and re-ran, but training doesn't progress; it throws an error and stops.
- Then I saw that this file is used in the config file for a variable called `bbox_file`. I tried commenting that out, but that didn't work either.
So, in short: how can I start training for my use case simply by giving MMPose the paths to my COCO-formatted dataset, which already contains all the information about skeletons, keypoints, segmentation, bounding boxes, etc.?
I saw #981, but the solution the OP mentioned there is very vague to me, starting from the first step. He says to create a package and then a class definition, but what exactly do I write in it to make my dataset work, and do I even need to do that, given that my dataset is already in COCO format?
Thank you so much for reading this long post. I might be missing something fundamental here, but I would really appreciate the help.
Thanks
P.S. The reason I have one JSON file named `test.json` and pass it for all three splits is that right now I am just trying to get the training process to start while more data is being collected and labeled.
The problem is with `data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json`. This file is used for evaluation only: it provides detected bounding boxes, which can be used for keypoint evaluation.
Since you do not have this, you may choose to use ground-truth bounding boxes for evaluation instead.
https://github.com/open-mmlab/mmpose/blob/4963fc4d46cf34a6e314a9f8ff6d1be241b042c5/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py#L99
Here, you can set `use_gt_bbox=True`.
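In the notebook-style cfg from earlier, that change could look like this (a minimal sketch; it assumes each split holds its own copy of `data_cfg` after `Config.fromfile`, so all three are set explicitly):

for split in ('train', 'val', 'test'):
    cfg.data[split].data_cfg.use_gt_bbox = True  # evaluate on ground-truth boxes
    cfg.data[split].data_cfg.bbox_file = ''      # the detection-results file is then unused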
Thank you for the feedback. Your suggestion worked and the model trained, but I also had to comment out these two lines from my code above, as they gave an error about the `PCK` metric:

cfg.evaluation.metric = 'PCK'
cfg.evaluation.save_best = 'PCK'

Other than that, I would like to ask whether the process I am using to train the model on my own custom data is right. If you can confirm this, that would be great.
Also, the MMPose installation in the tutorial notebook seems to have some dependency issues. The inference code does not run and gives an error. These are the versions Colab currently installs:

torch version: 1.10.0+cu111 True
torchvision version: 0.11.1+cu111
mmpose version: 0.20.0

I also tried `torch==1.9.0+cu111 torchvision==0.10.0+cu111`, but that didn't work either and gave the same error. The inference error also occurs in one of the top cells, where the notebook only runs inference with a pretrained model before moving on to training. Can you guide me on how to resolve this?
Thanks a lot.
One more question, from looking at the inference code in the tutorial notebook: am I correct in assuming that a detection model is first used to detect a person, and then a pose model is used to infer the pose?
So if I want to use this framework correctly on my problem, would I first have to train a detector that finds my object in the photo, and then detect the keypoints on that object?
- "I also had to comment out these two lines from the code above as it gave an error about the PCK metric": Yes, for the COCO dataset, AP is used for evaluation (see the sketch after this list).
- "Other than this I would like to ask if the process I am using to train the model on my own custom data is right": I did not check very carefully, but it looks good. You may check whether the loss decreases and the accuracy increases.
- "the MMPose installation in the tutorial notebook seems to have some dependency issues": @ly015 Could you please help check this problem? It seems that mmcv-full is not installed properly. Maybe you can uninstall and re-install the latest mmcv-full.
- "So I would first have to train a detector that detects my object in the photo and then detect the keypoints on that object?": For top-down algorithms, yes. You may use MMDetection to train a detector. You can also try bottom-up approaches, e.g. Associative Embedding (AE), which do not rely on object detection.
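A minimal sketch of the matching evaluation settings in the notebook-style cfg (the values mirror the `evaluation = dict(...)` line of the bottom-up config quoted later in this thread):

cfg.evaluation.metric = 'mAP'   # COCO-style keypoint AP instead of PCK
cfg.evaluation.save_best = 'AP'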
- Understood.
- Alright, I'll check that.
- Sorry, I know you tagged your colleague there, but are you referring me to uninstall and reinstall mmcv-full? If so: so far I have only tried this in Colab notebooks, where it takes almost 30 minutes to install. I'll try to check.
- I see. Then would my COCO dataset be compatible with these bottom-up approaches? I'll check this myself; I'm just asking whether I should be aware of any other changes here.

Thanks again.
- "are you referring me to uninstall and reinstall mmcv-full?" Yes, please try again.
- "would my COCO dataset be compatible with these bottom-up approaches?" Yes, please have a try.
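For the reinstall, a prebuilt wheel matched to the runtime avoids the long source build. A sketch of the notebook cells (the cu111/torch1.10.0 tags are assumptions based on the versions printed earlier; adjust them to your runtime):

!pip uninstall -y mmcv-full
!pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html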
I have tried the bottom-up approaches, but unfortunately I just get an error.
I used the following file: `/content/mmpose/configs/body/2d_kpt_sview_rgb_img/associative_embedding/coco/higherhrnet_w32_coco_512x512.py`
I made what I thought were the appropriate changes to this file: I changed `num_joints` from 17 to 4 wherever it was used, the same change I made in the TopDown config file to make it work. I am assuming there is some parameter configuration I am still missing here.
_base_ = ['../../../../_base_/datasets/coco.py']
log_level = 'INFO'
load_from = None
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=50)
evaluation = dict(interval=50, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=0.0015,
)
optimizer_config = dict(grad_clip=None)

# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[200, 260])
total_epochs = 300

log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    dataset_joints=4,
    dataset_channel=[
        [0, 1, 2, 3],
    ],
    inference_channel=[
        0, 1, 2, 3
    ])

data_cfg = dict(
    image_size=512,
    base_size=256,
    base_sigma=2,
    heatmap_size=[128, 256],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    num_scales=2,
    scale_aware_sigma=False,
)

# model settings
model = dict(
    type='AssociativeEmbedding',
    pretrained='https://download.openmmlab.com/mmpose/'
    'pretrain_models/hrnet_w32-36af842e.pth',
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(32, 64)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(32, 64, 128)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(32, 64, 128, 256))),
    ),
    keypoint_head=dict(
        type='AEHigherResolutionHead',
        in_channels=32,
        num_joints=4,
        tag_per_joint=True,
        extra=dict(final_conv_kernel=1, ),
        num_deconv_layers=1,
        num_deconv_filters=[32],
        num_deconv_kernels=[4],
        num_basic_blocks=4,
        cat_output=[True],
        with_ae_loss=[True, False],
        loss_keypoint=dict(
            type='MultiLossFactory',
            num_joints=4,
            num_stages=2,
            ae_loss_type='exp',
            with_ae_loss=[True, False],
            push_loss_factor=[0.001, 0.001],
            pull_loss_factor=[0.001, 0.001],
            with_heatmaps_loss=[True, True],
            heatmaps_loss_factor=[1.0, 1.0])),
    train_cfg=dict(),
    test_cfg=dict(
        num_joints=channel_cfg['dataset_joints'],
        max_num_people=30,
        scale_factor=[1],
        with_heatmaps=[True, True],
        with_ae=[True, False],
        project2image=True,
        align_corners=False,
        nms_kernel=5,
        nms_padding=2,
        tag_per_joint=True,
        detection_threshold=0.1,
        tag_threshold=1,
        use_detection_val=True,
        ignore_too_much=False,
        adjust=True,
        refine=True,
        flip_test=True))

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='BottomUpRandomAffine',
        rot_factor=30,
        scale_factor=[0.75, 1.5],
        scale_type='short',
        trans_factor=40),
    dict(type='BottomUpRandomFlip', flip_prob=0.5),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='BottomUpGenerateTarget',
        sigma=2,
        max_num_people=30,
    ),
    dict(
        type='Collect',
        keys=['img', 'joints', 'targets', 'masks'],
        meta_keys=[]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='BottomUpGetImgSize', test_scale_factor=[1]),
    dict(
        type='BottomUpResizeAlign',
        transforms=[
            dict(type='ToTensor'),
            dict(
                type='NormalizeTensor',
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
        ]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'aug_data', 'test_scale_factor', 'base_size',
            'center', 'scale', 'flip_index'
        ]),
]

test_pipeline = val_pipeline

data_root = 'data/coco'
data = dict(
    workers_per_gpu=2,
    train_dataloader=dict(samples_per_gpu=24),
    val_dataloader=dict(samples_per_gpu=1),
    test_dataloader=dict(samples_per_gpu=1),
    train=dict(
        type='BottomUpCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_train2017.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='BottomUpCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_val2017.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='BottomUpCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_val2017.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)
Please have a look and let me know what I might not be changing appropriately for my dataset.
_base_ = ['../../../../_base_/datasets/coco.py']
This should be replaced with your own 4-keypoint dataset info file.
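For example (a sketch; `my_square.py` is a hypothetical file name), you could save the edited 4-keypoint `dataset_info` dict from earlier in this thread as `configs/_base_/datasets/my_square.py` and point the model config at it:

# in your copy of higherhrnet_w32_coco_512x512.py
_base_ = ['../../../../_base_/datasets/my_square.py']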
Apologies, my bad. It worked; I was loading the wrong configuration file (`higherhrnet` instead of the `hrnet` one where I was making the changes).
Training started, but I now get this error when it tries to evaluate and save the model:
AssertionError Traceback (most recent call last)
<ipython-input-18-1b2b2ac6a433> in <module>()
15 # train model
16 train_model(
---> 17 model, datasets, cfg, distributed=False, validate=True, meta=dict())
13 frames
/content/mmpose/mmpose/apis/train.py in train_model(model, dataset, cfg, distributed, validate, timestamp, meta)
154 elif cfg.load_from:
155 runner.load_checkpoint(cfg.load_from)
--> 156 runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py in run(self, data_loaders, workflow, max_epochs, **kwargs)
125 if mode == 'train' and self.epoch >= self._max_epochs:
126 break
--> 127 epoch_runner(data_loaders[i], **kwargs)
128
129 time.sleep(1) # wait for some hooks like loggers to finish
/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py in train(self, data_loader, **kwargs)
52 self._iter += 1
53
---> 54 self.call_hook('after_train_epoch')
55 self._epoch += 1
56
/usr/local/lib/python3.7/dist-packages/mmcv/runner/base_runner.py in call_hook(self, fn_name)
305 """
306 for hook in self._hooks:
--> 307 getattr(hook, fn_name)(self)
308
309 def get_hook_info(self):
/usr/local/lib/python3.7/dist-packages/mmcv/runner/hooks/evaluation.py in after_train_epoch(self, runner)
265 """Called after every training epoch to evaluate the results."""
266 if self.by_epoch and self._should_evaluate(runner):
--> 267 self._do_evaluate(runner)
268
269 def _do_evaluate(self, runner):
/usr/local/lib/python3.7/dist-packages/mmcv/runner/hooks/evaluation.py in _do_evaluate(self, runner)
269 def _do_evaluate(self, runner):
270 """perform evaluation and save ckpt."""
--> 271 results = self.test_fn(runner.model, self.dataloader)
272 runner.log_buffer.output['eval_iter_num'] = len(self.dataloader)
273 key_score = self.evaluate(runner, results)
/content/mmpose/mmpose/apis/test.py in single_gpu_test(model, data_loader)
31 for data in data_loader:
32 with torch.no_grad():
---> 33 result = model(return_loss=False, **data)
34 results.append(result)
35
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/mmcv/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
48 return self.module(*inputs[0], **kwargs[0])
49 else:
---> 50 return super().forward(*inputs, **kwargs)
51
52 def scatter(self, inputs, kwargs, device_ids):
/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
164
165 if len(self.device_ids) == 1:
--> 166 return self.module(*inputs[0], **kwargs[0])
167 replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
168 outputs = self.parallel_apply(replicas, inputs, kwargs)
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/mmcv/runner/fp16_utils.py in new_func(*args, **kwargs)
96 'method of nn.Module')
97 if not (hasattr(args[0], 'fp16_enabled') and args[0].fp16_enabled):
---> 98 return old_func(*args, **kwargs)
99
100 # get the arg spec of the decorated method
/content/mmpose/mmpose/models/detectors/associative_embedding.py in forward(self, img, targets, masks, joints, img_metas, return_loss, return_heatmap, **kwargs)
131 **kwargs)
132 return self.forward_test(
--> 133 img, img_metas, return_heatmap=return_heatmap, **kwargs)
134
135 def forward_train(self, img, targets, masks, joints, img_metas, **kwargs):
/content/mmpose/mmpose/models/detectors/associative_embedding.py in forward_test(self, img, img_metas, return_heatmap, **kwargs)
214 scale (np.ndarray): the scale of image
215 """
--> 216 assert img.size(0) == 1
217 assert len(img_metas) == 1
218
AssertionError:
Thanks
`val_dataloader=dict(samples_per_gpu=1),`
For bottom-up models, only batch size 1 is supported during evaluation.
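In the notebook-style cfg, that corresponds to (a minimal sketch, mirroring the cfg edits earlier in the thread, where the val/test dataloaders had been set to 16):

cfg.data.val_dataloader = dict(samples_per_gpu=1)   # bottom-up evaluation supports batch size 1 only
cfg.data.test_dataloader = dict(samples_per_gpu=1)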
@jin-s13 Thanks a lot, the training ran completely without any errors after that change.
On another note, I tried reinstalling `mmcv-full`, simply with `pip uninstall mmcv-full` followed by `pip install mmcv-full`, but the same error I mentioned earlier is still there.
Hi @jin-s13, just to make sure I understand correctly: to train a model on a new dataset with, e.g., 4 keypoints in COCO format, one can just 1) convert the data/annotations into COCO format, 2) modify the data config file (https://github.com/open-mmlab/mmpose/blob/master/configs/body/2d_kpt_sview_rgb_img/associative_embedding/coco/hrnet_w32_coco_512x512_udp.py#L1) to support 4 keypoints, and 3) modify the model's output channel number? All the other training and evaluation code will then be compatible with the new dataset?
You also need to modify the dataset info file: https://github.com/open-mmlab/mmpose/blob/master/configs/_base_/datasets/coco.py
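Consolidating the config-side edits for an N-keypoint dataset (a sketch with N=4, restating the changes already shown in this thread, not an official checklist):

num_keypoints = 4

# 1) channel settings
channel_cfg = dict(
    dataset_joints=num_keypoints,
    dataset_channel=[list(range(num_keypoints))],
    inference_channel=list(range(num_keypoints)))

# 2) model output channels, e.g. for the AE head:
#      keypoint_head.num_joints = num_keypoints
#      keypoint_head.loss_keypoint.num_joints = num_keypoints

# 3) dataset info: edit configs/_base_/datasets/coco.py (or a copy referenced
#    via _base_) so that keypoint_info, skeleton_info, joint_weights, and
#    sigmas all describe your num_keypoints keypoints.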
(Quoting the earlier comment: "I have tried the bottom-up approaches but unfortunately I just get the following error: ..." together with the full modified higherhrnet_w32_coco_512x512.py config shown above.)
I have the same error! I noticed that you may resolve this problem by modifying `_base_ = ['../../../../_base_/datasets/coco.py']`. I also modified this .py file for 6 keypoints (my dataset has 6 keypoints), but the error still occurs. Is there any problem?
My coco.py:
dataset_info = dict(
    dataset_name='coco',
    paper_info=dict(
        author='Lin, Tsung-Yi and Maire, Michael and '
        'Belongie, Serge and Hays, James and '
        'Perona, Pietro and Ramanan, Deva and '
        r'Doll{\'a}r, Piotr and Zitnick, C Lawrence',
        title='Microsoft coco: Common objects in context',
        container='European conference on computer vision',
        year='2014',
        homepage='http://cocodataset.org/',
    ),
    keypoint_info={
        0:
        dict(
            name='left_top',
            id=1,
            color=[51, 153, 255],
            type='upper',
            swap='right_top'),
        1:
        dict(
            name='right_top',
            id=2,
            color=[51, 153, 255],
            type='upper',
            swap='left_top'),
        2:
        dict(
            name='right_bottom',
            id=3,
            color=[51, 153, 255],
            type='lower',
            swap='left_bottom'),
        3:
        dict(
            name='left_bottom',
            id=4,
            color=[51, 153, 255],
            type='lower',
            swap='right_bottom'),
        4:
        dict(
            name='center',
            id=5,
            color=[0, 255, 0],
            type='upper',
            swap=''),
        5:
        dict(
            name='head',
            id=6,
            color=[0, 255, 0],
            type='upper',
            swap='')
    },
    skeleton_info={
        0:
        dict(link=('left_top', 'right_top'), id=0, color=[0, 255, 0]),
        1:
        dict(link=('right_top', 'right_bottom'), id=1, color=[0, 255, 0]),
        2:
        dict(link=('right_bottom', 'left_bottom'), id=2, color=[0, 255, 0]),
        3:
        dict(link=('left_bottom', 'left_top'), id=3, color=[0, 255, 0]),
        4:
        dict(link=('center', 'head'), id=4, color=[51, 153, 255])
    },
    joint_weights=[1., 1., 1., 1., 1., 1.],
    sigmas=[0.026, 0.025, 0.025, 0.035, 0.035, 0.035])
While reading the configs of different models, I noticed there are four parameters:
- num_output_channels
- dataset_joints
- dataset_channel
- inference_channel

In the COCO dataset configs (human keypoints) these parameters are set to 17, while for example for animal_pose they are set to 20. The changes appear in the following part of the config:
channel_cfg = dict(
    num_output_channels=20,
    dataset_joints=20,
    dataset_channel=[
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
    ],
    inference_channel=[
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
    ])
I think it is necessary to make such changes according to the number of keypoints in my dataset. My questions are: first, should I make all of these changes for all types of models, e.g. top-down as well as bottom-up model structures? Second, if I format my dataset and annotations like the COCO dataset, but the objects I am annotating keypoints for are not humans, do all types of models work, or could this cause an error?