
Concatenating multiple datasets in MMPOSE?

YuktiADY opened this issue 3 years ago · 9 comments

Hello,

I would like to ask whether it is possible to concatenate two datasets and train on them in MMPose?

Thanks,

YuktiADY avatar Mar 29 '22 14:03 YuktiADY

Please refer to #1256
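
As the builder.py code quoted later in this thread shows, `build_dataset` turns a list of dataset dicts into a `ConcatDataset`, so the training config can pass a list under `data.train`. Below is a minimal, hypothetical excerpt of that pattern; the dataset types, paths, and the `data_cfg` / `train_pipeline` / `dataset_info` variables are placeholders assumed to be defined elsewhere in the config:

```python
# Hypothetical config excerpt: a *list* of dataset dicts under data.train
# makes build_dataset construct a ConcatDataset over them.
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    train=[
        dict(
            type='TopDownCocoDataset',
            ann_file='data/coco/annotations/person_keypoints_train2017.json',
            img_prefix='data/coco/train2017/',
            data_cfg=data_cfg,
            pipeline=train_pipeline,
            dataset_info=dataset_info),
        dict(
            type='MyCustomDataset',  # placeholder for your registered dataset class
            ann_file='data/custom/annotations/train.json',
            img_prefix='data/custom/images/',
            data_cfg=data_cfg,
            pipeline=train_pipeline,
            dataset_info=dataset_info),
    ],
)
```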

ly015 avatar Mar 29 '22 14:03 ly015

> Please refer to #1256

In the `build_dataset` function, the variable `c` is used but it is not defined anywhere. Because of this, I am facing an error while training the model.

YuktiADY avatar Mar 30 '22 14:03 YuktiADY

Which c do you mean? Could you please provide detailed error information?

ly015 avatar Mar 30 '22 15:03 ly015

> Which c do you mean? Could you please provide detailed error information?

Here:

```python
if isinstance(cfg, (list, tuple)):
    dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg])
```

I am trying to train on the COCO dataset and my custom dataset (concatenating both), so I made changes in the config as well as in builder.py (I added the code snippet you suggested). I am getting the error below.

File "./mmpose/tools/train.py", line 170, in main() File "./mmpose/tools/train.py", line 145, in main datasets = [build_dataset(cfg.data.train)] File "/home/yukti/mmpose/mmpose/mmpose/datasets/builder.py", line 77, in build_dataset dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg]) File "/home/yukti/mmpose/mmpose/mmpose/datasets/builder.py", line 77, in dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg]) File "/home/yukti/mmpose/mmpose/mmpose/datasets/builder.py", line 87, in build_dataset dataset = build_from_cfg(cfg, DATASETS, default_args) File "/home/yukti/mmpose/mmcv/mmcv/utils/registry.py", line 55, in build_from_cfg raise type(e)(f'{obj_cls.name}: {e}') AssertionError: TheodorePlusV2Dataset:

The changes I made in the config were to add the base files below:

```python
_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/theodore.py']
_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/coco_wholebody.py']
```

```python
dataset_type = 'TheodorePlusV2Dataset'
train=[
    dict(
        type=dataset_type,
        # ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        ann_file=f'{data_root}/coco_annotations/person_keypoints_train.json',
        # img_prefix=f'{data_root}/train2017/',
        img_prefix=f'{data_root}/train/img_png/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'/mnt/data/yjin/coco/annotations/person_keypoints_train2017.json',
        img_prefix=f'mnt/data/yjin/coco/images/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
],
```

YuktiADY avatar Mar 30 '22 15:03 YuktiADY

I don't remember ever suggesting any modification to builder.py. If you changed this file, could you please provide your code? The code line you quoted does not seem likely to cause a variable-not-defined error.
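
For what it's worth, here is a standalone sketch (not mmpose code) of why that line alone cannot raise a name error: `c` is the list-comprehension variable, bound by the comprehension itself.

```python
# `c` is defined by the comprehension, just as in builder.py; nothing
# outside the expression needs to declare it.
cfg = [dict(type='A'), dict(type='B')]
names = [c['type'] for c in cfg]
print(names)  # ['A', 'B']
```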

I am afraid your modified config is invalid because it has two `_base_` fields, both of which define `dataset_info`. In this case, the fields from these two dataset_info files will conflict, with one overriding the other or merging unexpectedly. You can print out the loaded config and check its content.
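
One way to do that check, as a sketch (the config path is a placeholder):

```python
from mmcv import Config

# Placeholder path: point this at your training config.
cfg = Config.fromfile('configs/my_concat_config.py')
# pretty_text shows the fully merged result, so you can see which of the
# two _base_ dataset_info definitions survived the merge.
print(cfg.pretty_text)
```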

ly015 avatar Mar 30 '22 16:03 ly015

I made changes in builder.py. Please find the code below:

```python
def _concat_dataset(cfg, default_args=None):
    types = cfg['type']
    ann_files = cfg['ann_file']
    img_prefixes = cfg.get('img_prefix', None)
    dataset_infos = cfg.get('dataset_info', None)

    num_joints = cfg['data_cfg'].get('num_joints', None)
    dataset_channel = cfg['data_cfg'].get('dataset_channel', None)

    datasets = []
    num_dset = len(ann_files)
    for i in range(num_dset):
        cfg_copy = copy.deepcopy(cfg)
        cfg_copy['ann_file'] = ann_files[i]

        if isinstance(types, (list, tuple)):
            cfg_copy['type'] = types[i]
        if isinstance(img_prefixes, (list, tuple)):
            cfg_copy['img_prefix'] = img_prefixes[i]
        if isinstance(dataset_infos, (list, tuple)):
            cfg_copy['dataset_info'] = dataset_infos[i]

        if isinstance(num_joints, (list, tuple)):
            cfg_copy['data_cfg']['num_joints'] = num_joints[i]

        if is_seq_of(dataset_channel, list):
            cfg_copy['data_cfg']['dataset_channel'] = dataset_channel[i]

        datasets.append(build_dataset(cfg_copy, default_args))

    return ConcatDataset(datasets)


def build_dataset(cfg, default_args=None):
    """Build a dataset from config dict.

    Args:
        cfg (dict): Config dict. It should at least contain the key "type".
        default_args (dict, optional): Default initialization arguments.
            Default: None.

    Returns:
        Dataset: The constructed dataset.
    """
    from .dataset_wrappers import RepeatDataset

    if isinstance(cfg, (list, tuple)):
        dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg])
    elif cfg['type'] == 'ConcatDataset':
        dataset = ConcatDataset(
            [build_dataset(c, default_args) for c in cfg['datasets']])
    elif cfg['type'] == 'RepeatDataset':
        dataset = RepeatDataset(
            build_dataset(cfg['dataset'], default_args), cfg['times'])
    elif isinstance(cfg.get('ann_file'), (list, tuple)):
        dataset = _concat_dataset(cfg, default_args)
    else:
        dataset = build_from_cfg(cfg, DATASETS, default_args)
    return dataset
```

I tried giving only one `_base_` field, but it still shows another error. In the config I included the COCO dataset as well, so where will it now look for the `dataset_info` of the COCO dataset (because I have given only one `_base_` field, which is for my custom dataset)? If you look at `num_images`, that is the total number of images in my custom dataset, which means the COCO dataset images are not being concatenated.

I am getting the below:

```
=> load 158849 samples
=> num_images: 50000
=> load 158849 samples
loading annotations into memory...
Done (t=5.46s)
creating index...
index created!
Traceback (most recent call last):
  File "/home/yukti/mmpose/mmcv/mmcv/utils/registry.py", line 52, in build_from_cfg
    return obj_cls(**args)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_wholebody_dataset.py", line 83, in __init__
    self.db = self._get_db()
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 100, in _get_db
    gt_db = self._load_coco_keypoint_annotations()
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 110, in _load_coco_keypoint_annotations
    gt_db.extend(self._load_coco_keypoint_annotation_kernel(img_id))
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_wholebody_dataset.py", line 132, in _load_coco_keypoint_annotation_kernel
    obj['face_kpts'] + obj['lefthand_kpts'] +
KeyError: 'foot_kpts'
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "./mmpose/tools/train.py", line 170, in <module>
    main()
  File "./mmpose/tools/train.py", line 145, in main
    datasets = [build_dataset(cfg.data.train)]
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/builder.py", line 77, in build_dataset
    dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg])
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/builder.py", line 77, in <listcomp>
    dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg])
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/builder.py", line 87, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/home/yukti/mmpose/mmcv/mmcv/utils/registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "TopDownCocoWholeBodyDataset: 'foot_kpts'"
```

Also, please tell me if I am thinking in the wrong direction: are the changes I made in the config and builder.py correct? My main goal is to concatenate COCO and my custom dataset and train on them.

YuktiADY avatar Mar 30 '22 16:03 YuktiADY

The code you provided seems exactly the same as the code in builder.py on the master branch. Did you make any modifications to the local code with which you met the error?

As for the config, could you please provide the full content of your config file? That would make it easier to locate the problem. From the error information above, it seems that you are using the TopDownCocoWholeBodyDataset class to load your own data, but the field 'foot_kpts', which the dataset class requires, is not found in your annotations. This indicates that the keypoint definition or data structure of your data differs from that of the COCO-WholeBody dataset.
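
A quick, hypothetical way to verify which keypoint fields an annotation file actually carries (the path is a placeholder for your `ann_file`):

```python
import json

# Placeholder path: the file you pass as ann_file for the wholebody dataset.
with open('data/custom/annotations/train.json') as f:
    ann = json.load(f)['annotations'][0]

# TopDownCocoWholeBodyDataset reads 'foot_kpts', 'face_kpts', 'lefthand_kpts'
# and 'righthand_kpts' from each annotation; plain COCO keypoint files only
# carry 'keypoints', which is what produces the KeyError above.
print(sorted(ann.keys()))
```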

In general, ConcatDataset is for combining multiple datasets with the same annotation format (i.e., loadable by the same dataset class), or at least with the same format of pre-processed data samples (different dataset classes and/or pipelines, but `DatasetClass.__getitem__()` returns the same data structure). You can double-check whether your data meets this requirement.
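
For intuition, a minimal sketch with torch's generic `ConcatDataset`, which simply chains indices across its members:

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

a = TensorDataset(torch.zeros(3, 2))  # stands in for your custom dataset
b = TensorDataset(torch.ones(5, 2))   # stands in for COCO

merged = ConcatDataset([a, b])
print(len(merged))  # 8: indices 0-2 come from a, 3-7 from b
print(merged[4])    # served by b, so every member must return samples
                    # with the same structure for training to work
```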

ly015 avatar Mar 30 '22 17:03 ly015

I got it. It wasn't the TopDownCocoWholeBodyDataset but the TopDownCocoDataset class that was used to load the data. After concatenating, will it load the images of my dataset as well as the COCO dataset? Does mixing the two datasets improve the performance of the model?

YuktiADY avatar Mar 31 '22 14:03 YuktiADY

I trained the HRNet smaller-resolution model for 15 epochs, and now I want to train it for another 10 epochs. So in my config I set `resume_from = '/home/yukti/mmpose/theodore_2022-04-22/best_AP_epoch_15.pth'` (the checkpoint from the last training), but the training does not resume or start; instead it shows this message.

```
2022-04-25 15:36:49,579 - mmpose - INFO - workflow: [('train', 1)], max: 10 epochs
2022-04-25 15:36:49,579 - mmpose - INFO - Checkpoints will be saved to /home/yukti/mmpose/theodore_2022-04-25 by HardDiskBackend.
INFO:torch.distributed.elastic.agent.server.api:[default] worker group successfully finished. Waiting 300 seconds for other agents to finish.
INFO:torch.distributed.elastic.agent.server.api:Local worker group finished (SUCCEEDED). Waiting 300 seconds for other agents to finish
/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/elastic/utils/store.py:71: FutureWarning: This is an experimental API and will be changed in future.
  "This is an experimental API and will be changed in future.", FutureWarning
INFO:torch.distributed.elastic.agent.server.api:Done waiting for other agents. Elapsed: 0.0005908012390136719 seconds
{"name": "torchelastic.worker.status.SUCCEEDED", "source": "WORKER", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": 0, "group_rank": 0, "worker_id": "87571", "role": "default", "hostname": "dst-toaster.etit.tu-chemnitz.de", "state": "SUCCEEDED", "total_run_time": 45, "rdzv_backend": "static", "raw_error": null, "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\", \"local_rank\": [0], \"role_rank\": [0], \"role_world_size\": [1]}", "agent_restarts": 0}}
{"name": "torchelastic.worker.status.SUCCEEDED", "source": "AGENT", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": null, "group_rank": 0, "worker_id": null, "role": "default", "hostname": "dst-toaster.etit.tu-chemnitz.de", "state": "SUCCEEDED", "total_run_time": 45, "rdzv_backend": "static", "raw_error": null, "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\"}", "agent_restarts": 0}}
```

This is neither an error nor a warning. Please tell me how to proceed.
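
Could the cause be that `total_epochs` is an absolute epoch count rather than an increment? The log says `max: 10 epochs` while the resumed checkpoint is already at epoch 15, so perhaps the runner considers training finished immediately. A sketch of what I mean (hypothetical config excerpt):

```python
# If total_epochs is absolute, resuming an epoch-15 checkpoint needs a
# target beyond 15, e.g. 25 for ten more epochs.
resume_from = '/home/yukti/mmpose/theodore_2022-04-22/best_AP_epoch_15.pth'
total_epochs = 25
```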

YuktiADY avatar Apr 25 '22 13:04 YuktiADY