YOLOX_OBB init prefetcher, this might take one minute or less...


Open zibai4991 opened this issue 2 years ago • 1 comments

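Hello author,

I followed the steps in https://zhuanlan.zhihu.com/p/430850089 up to the model training stage, but once the training script starts it hangs at "init prefetcher, this might take one minute or less...". Tracing through the source, I found that the program is stuck at line 26 of yolox/data/data_prefetcher.py: self.next_input, self.next_target, _, _ = next(self.loader). I am not sure whether my configuration parameters are the problem. Any advice or help would be appreciated.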

Here is the output printed by the run:

(yolox_obb) zibai@eng2:~/opt/YOLOX_OBBG$ bash my_exps/train.sh MEF-G exps/example/yolox_obb/yolox_s_MFE-G.py 0 1 16 --fp16

activate env yolox_obb

Current dir is /home/zibai/opt/YOLOX_OBBG
exp is exps/example/yolox_obb/yolox_s_MFE-G.py
cuda_device is cuda: 0
num_device is 1
batch_size is 16
pth is
other args: --fp16
ready train ....

2023-07-19 09:06:31 | INFO | yolox.core.trainer:131 - args: Namespace(batch_size=16, cache=False, ckpt=None, devices=1, dist_backend='nccl', dist_url=None, exp_file='exps/example/yolox_obb/yolox_s_MFE-G.py', experiment_name='MEF-G', fp16=True, machine_rank=0, name=None, num_machines=1, occupy=False, options=None, resume=False, start_epoch=None)

2023-07-19 09:06:31 | INFO | yolox.core.trainer:132 - exp value:

╒═════════════════════╤═══════════════════════════════════════════════════════╕
│ keys │ values │
╞═════════════════════╪═══════════════════════════════════════════════════════╡
│ seed │ None │
│ output_dir │ 'YOLOX_outputs' │
│ print_interval │ 10 │
│ eval_interval │ 10 │
│ modules_config │ 'configs/modules/yoloxs_obb.yaml' │
│ losses_config │ 'configs/losses/yolox_losses_obb.yaml' │
│ dataset_config │ 'configs/datasets/MFE-G.yaml' │
│ data_num_workers │ 4 │
│ input_size │ (1024, 1024) │
│ multiscale_range │ 5 │
│ mosaic_prob │ 1.0 │
│ mixup_prob │ 0.0 │
│ hsv_prob │ 1.0 │
│ flip_prob │ 0.5 │
│ degrees │ 10.0 │
│ translate │ 0.1 │
│ mosaic_scale │ (0.4, 1.2) │
│ mixup_scale │ (0.4, 1.2) │
│ shear │ 2.0 │
│ enable_mixup │ True │
│ warmup_epochs │ 1 │
│ max_epoch │ 500 │
│ warmup_lr │ 0 │
│ basic_lr_per_img │ 0.00015625 │
│ scheduler │ 'yoloxwarmcos' │
│ no_aug_epochs │ 20 │
│ min_lr_ratio │ 0.05 │
│ ema │ True │
│ no_eval │ False │
│ weight_decay │ 0.0005 │
│ momentum │ 0.9 │
│ exp_name │ 'MEF-G' │
│ test_size │ (1024, 1024) │
│ postprocess_cfg │ {'conf_thre': 0.05, 'nms_thre': 0.1} │
│ copy_paste_prob │ 1.0 │
│ enable_debug │ False │
│ enable_resample │ True │
│ aug_ignore │ None │
│ empty_ignore │ True │
│ long_wh_thre │ 6 │
│ short_wh_thre │ 3 │
│ overlaps_thre │ 0.6 │
│ evaluate_cfg │ {'is_merge': False, 'is_submiss': False, 'nproc': 10} │
│ export_input_names │ ['input'] │
│ export_output_names │ ['boxes', 'scores', 'class'] │
│ include_post │ True │
│ data_dir │ 'datasets/MFE-G/Bbox' │
│ train_ann │ 'train' │
│ val_ann │ 'val' │
│ test_ann │ 'test' │
│ num_classes │ 8 │
╘═════════════════════╧═══════════════════════════════════════════════════════╛

2023-07-19 09:06:31 | INFO | yolox.models.parse_model:18 - overriding modules.yaml num_classes=80 with num_classes=8

/home/zibai/opt/anaconda3/envs/yolox_obb/lib/python3.7/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1639180594101/work/aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

2023-07-19 09:06:31 | INFO | yolox.core.trainer:138 - Model Summary: Params: 8.05M, Gflops: 55.81

2023-07-19 09:06:33 | INFO | yolox.core.trainer:156 - init prefetcher, this might take one minute or less...
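
A stall like the one at next(self.loader) can be confirmed without waiting indefinitely. The following is a minimal, standard-library-only sketch and not part of YOLOX_OBB: the helper name next_with_timeout is hypothetical, and it only reports whether a single next() call finishes within a time budget.

```python
import threading


def next_with_timeout(iterator, timeout_s):
    """Call next(iterator) in a daemon thread.

    Returns (True, item) if the call finishes within timeout_s seconds,
    or (False, None) if it is still running when the timeout expires.
    Exceptions raised by next(), including StopIteration, are re-raised.
    """
    result = {}

    def worker():
        try:
            result["item"] = next(iterator)
        except Exception as exc:  # hand the error back to the caller
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        # The fetch is still blocked; the daemon thread is abandoned.
        return False, None
    if "error" in result:
        raise result["error"]
    return True, result["item"]


if __name__ == "__main__":
    finished, item = next_with_timeout(iter([1, 2, 3]), timeout_s=1.0)
    print(finished, item)
```

Wrapping the iterator that the prefetcher consumes in a helper like this would show whether the first batch ever arrives. If it does not, lowering data_num_workers may make it easier to see where the dataset code is blocked, though that is a general debugging habit rather than something this thread confirms.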

Jul 19 '23 01:07 zibai4991

In your config, change empty_ignore from True to False:

│ empty_ignore │ True │ -> empty_ignore False
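
The thread does not say why this setting matters. One plausible mechanism, stated purely as an assumption and not taken from the YOLOX_OBB source, is that a "skip empty samples" flag makes the dataset keep drawing new indices until it finds an image with at least one box. If no annotations are loaded (for example because of a wrong path), that search never ends. The toy class below (ToyDataset, skip_empty, and max_tries are all hypothetical names) illustrates the idea and bounds the retries so the failure is reported instead of hanging.

```python
import random


class ToyDataset:
    """Hypothetical dataset: labels[i] is the list of boxes for image i."""

    def __init__(self, labels, skip_empty, max_tries=100, seed=0):
        self.labels = labels
        self.skip_empty = skip_empty  # stand-in for a flag like empty_ignore
        self.max_tries = max_tries    # guard that an unbounded loop would lack
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        for _ in range(self.max_tries):
            boxes = self.labels[index]
            if boxes or not self.skip_empty:
                return index, boxes
            # Empty sample and skipping is on: try another random index.
            index = self.rng.randrange(len(self.labels))
        raise RuntimeError(
            "no non-empty sample found; check that annotations were loaded"
        )


if __name__ == "__main__":
    all_empty = [[], [], []]
    print(ToyDataset(all_empty, skip_empty=False)[0])
    try:
        ToyDataset(all_empty, skip_empty=True, max_tries=10)[0]
    except RuntimeError as err:
        print("detected:", err)
```

If the real cause is along these lines, turning the flag off only hides the symptom, so it would still be worth checking that data_dir and the train_ann split actually contain labeled objects.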



Jul 19 '23 10:07 DDGRCF
