
RTMPose training on COCO fails with an error

Open zhenhao-huang opened this issue 1 year ago • 0 comments

Prerequisite

  • [X] I have searched Issues and Discussions but cannot get the expected help.
  • [X] The bug has not been fixed in the latest version (https://github.com/open-mmlab/mmpose).

Environment

  • python 3.8.10
  • mmcv 2.1.0
  • mmpose 1.3.2

Reproduces the problem - code sample

None

Reproduces the problem - command or script

python tools/train.py projects/rtmpose/rtmpose/body_2d_keypoint/rtmpose-m_8xb256-420e_coco-256x192.py

Reproduces the problem - error message

08/17 16:08:51 - mmengine - INFO - load backbone. in model from: https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/cspnext-m_udp-aic-coco_210e-256x192-f2f7d6f6_20230130.pth
Loads checkpoint by http backend from path: https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/cspnext-m_udp-aic-coco_210e-256x192-f2f7d6f6_20230130.pth
08/17 16:08:51 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io
08/17 16:08:51 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future.
08/17 16:08:51 - mmengine - INFO - Checkpoints will be saved to /home/qdwl/Data2/hzh/pose/mmpose/mmpose/work_dirs/rtmpose-m_8xb256-420e_coco-256x192.
Traceback (most recent call last):
  File "tools/train.py", line 162, in <module>
    main()
  File "tools/train.py", line 158, in main
    runner.train()
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    for idx, data_batch in enumerate(self.dataloader):
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
    idx, data = self._get_data()
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1152, in _get_data
    success, data = self._try_get_data()
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 990, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/multiprocessing/queues.py", line 116, in get
    return _ForkingPickler.loads(res)
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 289, in rebuild_storage_fd
    fd = df.detach()
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/multiprocessing/reduction.py", line 189, in recv_handle
    return recvfds(s, 1)[0]
  File "/home/qdwl/anaconda3/envs/mmpose-1.x/lib/python3.8/multiprocessing/reduction.py", line 164, in recvfds
    raise RuntimeError('received %d items of ancdata' %
RuntimeError: received 0 items of ancdata
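
Possible workaround (not confirmed for this setup): the traceback ends in `rebuild_storage_fd`, where a DataLoader worker hands a tensor to the main process as a file descriptor. "received 0 items of ancdata" at that point is often associated with the process running out of open file descriptors. The sketch below is an assumption-laden illustration, not an mmpose or mmengine fix: the helper names `raise_open_file_limit` and `use_file_system_sharing` and the target of 4096 are hypothetical. It raises the soft open-file limit with the standard `resource` module and, if PyTorch is importable, switches tensor sharing to the `file_system` strategy via `torch.multiprocessing.set_sharing_strategy`.

```python
import resource


def raise_open_file_limit(target=4096):
    """Raise the soft RLIMIT_NOFILE toward `target`, capped at the hard limit.

    Returns the (soft, hard) pair in effect afterwards.
    `target` is an illustrative value, not a recommendation from mmpose.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the hard limit unless the hard limit is unlimited.
    if hard == resource.RLIM_INFINITY:
        new_soft = target
    else:
        new_soft = min(target, hard)
    # Only raise the limit; leave it alone if it is already unlimited or higher.
    if soft != resource.RLIM_INFINITY and new_soft > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        except (ValueError, OSError):
            pass  # The OS refused the change; keep the current limit.
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def use_file_system_sharing():
    """Share tensors between processes through files instead of descriptors.

    Returns True if torch was importable and the strategy was set.
    """
    try:
        import torch.multiprocessing as mp
    except ImportError:
        return False
    mp.set_sharing_strategy("file_system")
    return True


if __name__ == "__main__":
    print("open-file limit (soft, hard):", raise_open_file_limit())
    print("file_system sharing enabled:", use_file_system_sharing())
```

If this is tried, both calls would presumably need to run in the main process before the dataloader is built, for example near the top of a local copy of `tools/train.py`. Whether either one resolves the error here is untested.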

Additional information

No response

zhenhao-huang, Aug 17 '24 08:08