
[Bug] Replacing the head of LiteHRNet with DSNT makes it unable to run

Open Ninetya opened this issue 1 year ago • 1 comment

Prerequisite

  • [X] I have searched Issues and Discussions but cannot get the expected help.
  • [X] The bug has not been fixed in the latest version (https://github.com/open-mmlab/mmpose).

Environment

OrderedDict([('sys.platform', 'win32'), ('Python', '3.9.12 (main, Apr 4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)]'), ('CUDA available', False), ('numpy_random_seed', 2147483648), ('MSVC', 'Microsoft (R) C/C++ Optimizing Compiler Version 19.34.31937 for x64'), ('GCC', 'n/a'), ('PyTorch', '2.1.1+cpu'), ('PyTorch compiling details', 'PyTorch built with:\n - C++ Version: 199711\n - MSVC 192930151\n - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications\n - Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)\n - OpenMP 2019\n - LAPACK is enabled (usually provided by MKL)\n - CPU capability usage: AVX512\n - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /bigobj /FS -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /utf-8 /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=OFF, TORCH_VERSION=2.1.1, USE_CUDA=0, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, \n'), ('TorchVision', '0.16.1+cpu'), ('OpenCV', '4.8.1'), ('MMEngine', '0.7.3'), ('MMPose', '1.2.0+6d10b2e')])

Package Version


absl-py 2.0.0 addict 2.4.0 alabaster 0.7.13 albumentations 1.3.1 appdirs 1.4.4 attrs 23.1.0 Babel 2.13.1 cachetools 5.3.2 certifi 2023.11.17 cffi 1.16.0 charset-normalizer 3.3.2 chumpy 0.70 cityscapesScripts 2.2.2 click 8.1.7 colorama 0.4.6 coloredlogs 15.0.1 commonmark 0.9.1 contourpy 1.2.0 coverage 7.3.2 cycler 0.12.1 Cython 3.0.3 docutils 0.18.1 e2cnn 0.2.3 easydict 1.7 exceptiongroup 1.2.0 fairscale 0.4.13 filelock 3.12.4 flake8 6.1.0 flatbuffers 23.5.26 fonttools 4.45.0 fsspec 2023.9.2 future 0.18.3 google-auth 2.23.4 google-auth-oauthlib 1.1.0 grpcio 1.59.3 humanfriendly 10.0 idna 3.4 imagecorruptions 1.1.2 imageio 2.31.5 imagesize 1.4.1 importlib-metadata 6.8.0 importlib-resources 6.1.1 iniconfig 2.0.0 interrogate 1.5.0 isort 4.3.21 Jinja2 3.1.2 joblib 1.3.2 json-tricks 3.17.3 kiwisolver 1.4.5 lazy_loader 0.3 Markdown 3.5.1 markdown-it-py 3.0.0 MarkupSafe 2.1.3 matplotlib 3.8.2 mccabe 0.7.0 mdurl 0.1.2 mmcv 2.1.0 mmengine 0.7.3 mmpose 1.2.0 mmrotate 1.0.0rc1 mpmath 1.3.0 munkres 1.1.4 networkx 3.1 numpy 1.26.0 oauthlib 3.2.2 onnx 1.15.0 onnxruntime 1.16.3 opencv-python 4.8.1.78 opencv-python-headless 4.8.1.78 packaging 23.2 pandas 2.1.1 parameterized 0.9.0 pi 0.1.2 Pillow 10.0.1 pip 23.3.1 platformdirs 4.0.0 pluggy 1.3.0 protobuf 4.23.4 py 1.11.0 pyasn1 0.5.1 pyasn1-modules 0.3.0 pycocotools 2.0.7 pycodestyle 2.11.1 pycparser 2.21 pyflakes 3.1.0 Pygments 2.17.1 pyparsing 3.1.1 pyquaternion 0.9.9 pyreadline3 3.4.1 pytest 7.4.3 pytest-runner 6.0.0 python-dateutil 2.8.2 pytz 2023.3.post1 PyYAML 6.0.1 qudida 0.0.4 recommonmark 0.7.1 regex 2023.10.3 requests 2.31.0 requests-oauthlib 1.3.1 rich 13.7.0 rsa 4.9 scikit-image 0.22.0 scikit-learn 1.3.2 scipy 1.11.3 setuptools 60.2.0 shapely 2.0.2 six 1.16.0 smplx 0.1.28 snowballstemmer 2.2.0 Sphinx 7.2.6 sphinx-markdown-tables 0.0.17 sphinx-rtd-theme 1.3.0 sphinxcontrib-applehelp 1.0.7 sphinxcontrib-devhelp 1.0.5 sphinxcontrib-htmlhelp 2.0.4 sphinxcontrib-jquery 4.1 sphinxcontrib-jsmath 1.0.1 sphinxcontrib-qthelp 1.0.6 
sphinxcontrib-serializinghtml 1.1.9 sympy 1.12 tabulate 0.9.0 tensorboard 2.15.1 tensorboard-data-server 0.7.2 tensorboardX 2.6.2.2 termcolor 2.3.0 terminaltables 3.1.10 threadpoolctl 3.2.0 tifffile 2023.9.26 titlecase 2.4.1 toml 0.10.2 tomli 2.0.1 torch 2.1.1 torchvision 0.16.1 tqdm 4.66.1 typing 3.7.4.3 typing_extensions 4.8.0 tzdata 2023.3 urllib3 2.1.0 Werkzeug 3.0.1 wheel 0.37.1 xdoctest 1.1.2 xtcocotools 1.14.3 yacs 0.1.8 yapf 0.40.1 zipp 3.17.0

Reproduces the problem - code sample

_base_ = ['../../../_base_/default_runtime.py']

# runtime
train_cfg = dict(max_epochs=210, val_interval=10)

# optimizer
optim_wrapper = dict(optimizer=dict(
    type='Adam',
    lr=5e-4,
))

# learning policy
param_scheduler = [
    dict(
        type='LinearLR', begin=0, end=500, start_factor=0.001,
        by_epoch=False),  # warm-up
    dict(
        type='MultiStepLR', begin=0, end=210, milestones=[170, 200],
        gamma=0.1, by_epoch=True)
]

# automatically scaling LR based on the actual training batch size
auto_scale_lr = dict(base_batch_size=512)

# hooks
default_hooks = dict(checkpoint=dict(save_best='coco/AP', rule='greater'))

# codec settings
codec = dict(
    type='IntegralRegressionLabel',
    input_size=(256, 256),
    heatmap_size=(64, 64),
    sigma=2.0,
    normalize=True)

# model settings
model = dict(
    type='TopdownPoseEstimator',
    data_preprocessor=dict(
        type='PoseDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True),
    backbone=dict(
        type='LiteHRNet',
        in_channels=3,
        extra=dict(
            stem=dict(stem_channels=32, out_channels=32, expand_ratio=1),
            num_stages=3,
            stages_spec=dict(
                num_modules=(2, 4, 2),
                num_branches=(2, 3, 4),
                num_blocks=(2, 2, 2),
                module_type=('LITE', 'LITE', 'LITE'),
                with_fuse=(True, True, True),
                reduce_ratios=(8, 8, 8),
                num_channels=(
                    (40, 80),
                    (40, 80, 160),
                    (40, 80, 160, 320),
                )),
            with_head=True,
        )),
    head=dict(
        type='DSNTHead',
        in_channels=40,
        in_featuremap_size=(8, 8),
        num_joints=17,
        loss=dict(
            type='MultipleLossWrapper',
            losses=[
                dict(type='SmoothL1Loss', use_target_weight=True),
                dict(type='JSDiscretLoss', use_target_weight=True)
            ]),
        decoder=codec),
    test_cfg=dict(
        flip_test=True,
        shift_coords=True,
        shift_heatmap=True,
    ),
    # init_cfg=dict(
    #     type='Pretrained',
    #     checkpoint='https://download.openmmlab.com/mmpose/'
    #     'pretrain_models/td-hm_res50_8xb64-210e_coco-256x192.pth')
)

# base dataset settings
dataset_type = 'CocoDataset'
data_mode = 'topdown'
data_root = 'data/coco/'

# pipelines
train_pipeline = [
    dict(type='LoadImage'),
    dict(type='GetBBoxCenterScale'),
    dict(type='RandomFlip', direction='horizontal'),
    dict(type='RandomHalfBody'),
    dict(type='RandomBBoxTransform'),
    dict(type='TopdownAffine', input_size=codec['input_size']),
    dict(type='GenerateTarget', encoder=codec),
    dict(type='PackPoseInputs')
]
test_pipeline = [
    dict(type='LoadImage'),
    dict(type='GetBBoxCenterScale'),
    dict(type='TopdownAffine', input_size=codec['input_size']),
    dict(type='PackPoseInputs')
]

# data loaders
train_dataloader = dict(
    batch_size=16,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_mode=data_mode,
        ann_file='annotations/person_keypoints_train2017.json',
        data_prefix=dict(img='train2017/'),
        pipeline=train_pipeline,
    ))
val_dataloader = dict(
    batch_size=32,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False, round_up=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_mode=data_mode,
        ann_file='annotations/person_keypoints_val2017.json',
        bbox_file=f'{data_root}person_detection_results/'
        'COCO_val2017_detections_AP_H_56_person.json',
        data_prefix=dict(img='val2017/'),
        test_mode=True,
        pipeline=test_pipeline,
    ))
test_dataloader = val_dataloader

# hooks
default_hooks = dict(checkpoint=dict(save_best='coco/AP', rule='greater'))

# evaluators
val_evaluator = dict(
    type='CocoMetric',
    ann_file=f'{data_root}annotations/person_keypoints_val2017.json')
test_evaluator = val_evaluator

Reproduces the problem - command or script

python tools/train.py D:\mmpose\configs\body_2d_keypoint\integral_regression\coco\ipr_litehrnet-18_dsnt-8xb64-210e_coco-256x192.py

Reproduces the problem - error message

Traceback (most recent call last):
  File "D:\mmpose\tools\train.py", line 162, in <module>
    main()
  File "D:\mmpose\tools\train.py", line 158, in main
    runner.train()
  File "C:\pythonProject2\lib\site-packages\mmengine\runner\runner.py", line 1721, in train
    model = self.train_loop.run()  # type: ignore
  File "C:\pythonProject2\lib\site-packages\mmengine\runner\loops.py", line 96, in run
    self.run_epoch()
  File "C:\pythonProject2\lib\site-packages\mmengine\runner\loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "C:\pythonProject2\lib\site-packages\mmengine\runner\loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "C:\pythonProject2\lib\site-packages\mmengine\model\base_model\base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss')  # type: ignore
  File "C:\pythonProject2\lib\site-packages\mmengine\model\base_model\base_model.py", line 340, in _run_forward
    results = self(**data, mode=mode)
  File "C:\pythonProject2\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\pythonProject2\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\pythonProject2\lib\site-packages\mmpose\models\pose_estimators\base.py", line 145, in forward
    return self.loss(inputs, data_samples)
  File "C:\pythonProject2\lib\site-packages\mmpose\models\pose_estimators\topdown.py", line 74, in loss
    self.head.loss(feats, data_samples, train_cfg=self.train_cfg))
  File "C:\pythonProject2\lib\site-packages\mmpose\models\heads\regression_heads\dsnt_head.py", line 109, in loss
    pred_coords, pred_heatmaps = self.forward(inputs)
  File "C:\pythonProject2\lib\site-packages\mmpose\models\heads\regression_heads\integral_regression_head.py", line 190, in forward
    pred_x = self._linear_expectation(heatmaps, self.linspace_x)
  File "C:\pythonProject2\lib\site-packages\mmpose\models\heads\regression_heads\integral_regression_head.py", line 156, in _linear_expectation
    heatmaps = heatmaps.mul(linspace).reshape(B, N, -1)
RuntimeError: The size of tensor a (512) must match the size of tensor b (64) at non-singleton dimension 3
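The failing multiply in `_linear_expectation` can be reproduced in isolation with plain PyTorch. The shapes below are assumptions inferred from the error message, not taken from mmpose internals: a linspace built for a 64-wide heatmap cannot broadcast against heatmaps that are actually 512 wide.

```python
import torch

# Hypothetical shapes inferred from the traceback: the head's
# normalized-coordinate linspace is sized for a 64-wide heatmap, but
# the heatmaps it receives are 512 wide, so the element-wise
# multiplication cannot broadcast at dimension 3.
B, N = 2, 17                                    # batch size, number of joints
heatmaps = torch.rand(B, N, 512, 512)           # actual heatmap size
linspace_x = torch.linspace(0, 1, 64).reshape(1, 1, 1, 64)  # expected width

try:
    heatmaps.mul(linspace_x).reshape(B, N, -1)
except RuntimeError as err:
    print(err)  # same size-mismatch error as in the report
```

This suggests the head was configured for a smaller input feature map than the backbone actually produces.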

Additional information

No response

Ninetya • Dec 01 '23 14:12

Please check whether the argument in_featuremap_size is compatible with the output feature maps from LiteHRNet.

Ben-Louis • Dec 04 '23 02:12
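One way to act on the suggestion above is to run a dummy input through the backbone and read off the spatial size of its output. The snippet below uses a single stride-4 convolution as a toy stand-in for a backbone with total downsampling factor 4 (it is not LiteHRNet itself); with a 256x256 input it produces a 64x64 feature map, so a head configured with in_featuremap_size=(8, 8) would not match.

```python
import torch
import torch.nn as nn

# Toy stand-in for a backbone with overall stride 4 (hypothetical,
# illustrative only). Replace it with the real backbone instance to
# measure the actual output size.
backbone = nn.Conv2d(3, 40, kernel_size=3, stride=4, padding=1)

feat = backbone(torch.zeros(1, 3, 256, 256))   # dummy 256x256 input
print(tuple(feat.shape[-2:]))                  # spatial size the head must expect -> (64, 64)
```

The head's in_featuremap_size should describe the measured size (here 64x64), not a smaller one.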