Stuck at yolox.core.trainer:148 - init prefetcher, this might take one minute or less...

Open abalikhan opened this issue 3 years ago • 2 comments

Hi @ifzhang, thank you for providing such a great model with every detail documented. I am trying to train ByteTrack on a custom dataset, and training gets stuck at this line: "yolox.core.trainer:148 - init prefetcher, this might take one minute or less...". Does anyone know what the problem could be?

Thank you for your help.

2022-01-21 00:38:25.411 | INFO | yolox.core.launch:launch_by_subprocess:145 -

Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

2022-01-21 00:38:27.978 | INFO | yolox.core.launch:_distributed_worker:184 - Rank 1 initialization finished.
2022-01-21 00:38:27.989 | INFO | yolox.core.launch:_distributed_worker:184 - Rank 2 initialization finished.
2022-01-21 00:38:28.015 | INFO | yolox.core.launch:_distributed_worker:184 - Rank 3 initialization finished.
2022-01-21 00:38:28.020 | INFO | yolox.core.launch:_distributed_worker:184 - Rank 0 initialization finished.
2022-01-21 00:38:33 | INFO | yolox.core.trainer:124 - args: Namespace(batch_size=32, ckpt='/data/stars/user/abali/ByteTrack/pretrained/yolox_x.pth', devices=4, dist_backend='nccl', dist_url=None, exp_file='/data/stars/user/abali/ByteTrack/exps/example/mot/yolox_x_act.py', experiment_name='yolox_x_act', fp16=True, local_rank=0, machine_rank=0, name=None, num_machines=1, occupy=True, opts=[], resume=False, start_epoch=None)
2022-01-21 00:38:33 | INFO | yolox.core.trainer:125 - exp value:
╒══════════════════╤═══════════════════╕
│ keys             │ values            │
╞══════════════════╪═══════════════════╡
│ seed             │ None              │
│ output_dir       │ './YOLOX_outputs' │
│ print_interval   │ 1                 │
│ eval_interval    │ 5                 │
│ num_classes      │ 1                 │
│ depth            │ 1.33              │
│ width            │ 1.25              │
│ data_num_workers │ 4                 │
│ input_size       │ (400, 720)        │
│ random_size      │ (18, 32)          │
│ train_ann        │ 'train.json'      │
│ val_ann          │ 'test.json'       │
│ degrees          │ 10.0              │
│ translate        │ 0.1               │
│ scale            │ (0.1, 2)          │
│ mscale           │ (0.8, 1.6)        │
│ shear            │ 2.0               │
│ perspective      │ 0.0               │
│ enable_mixup     │ True              │
│ warmup_epochs    │ 1                 │
│ max_epoch        │ 80                │
│ warmup_lr        │ 0                 │
│ basic_lr_per_img │ 1.5625e-05        │
│ scheduler        │ 'yoloxwarmcos'    │
│ no_aug_epochs    │ 10                │
│ min_lr_ratio     │ 0.05              │
│ ema              │ True              │
│ weight_decay     │ 0.0005            │
│ momentum         │ 0.9               │
│ exp_name         │ 'yolox_x_act'     │
│ test_size        │ (400, 720)        │
│ test_conf        │ 0.1               │
│ nmsthre          │ 0.7               │
╘══════════════════╧═══════════════════╛
2022-01-21 00:38:34 | INFO | yolox.core.trainer:131 - Model Summary: Params: 99.00M, Gflops: 197.93
2022-01-21 00:38:34 | INFO | yolox.core.trainer:289 - loading checkpoint for fine tuning
2022-01-21 00:38:36 | WARNING | yolox.utils.checkpoint:27 - Shape of head.cls_preds.0.weight in checkpoint is torch.Size([80, 320, 1, 1]), while shape of head.cls_preds.0.weight in model is torch.Size([1, 320, 1, 1]).
2022-01-21 00:38:36 | WARNING | yolox.utils.checkpoint:27 - Shape of head.cls_preds.0.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.0.bias in model is torch.Size([1]).
2022-01-21 00:38:36 | WARNING | yolox.utils.checkpoint:27 - Shape of head.cls_preds.1.weight in checkpoint is torch.Size([80, 320, 1, 1]), while shape of head.cls_preds.1.weight in model is torch.Size([1, 320, 1, 1]).
2022-01-21 00:38:36 | WARNING | yolox.utils.checkpoint:27 - Shape of head.cls_preds.1.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.1.bias in model is torch.Size([1]).
2022-01-21 00:38:36 | WARNING | yolox.utils.checkpoint:27 - Shape of head.cls_preds.2.weight in checkpoint is torch.Size([80, 320, 1, 1]), while shape of head.cls_preds.2.weight in model is torch.Size([1, 320, 1, 1]).
2022-01-21 00:38:36 | WARNING | yolox.utils.checkpoint:27 - Shape of head.cls_preds.2.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.2.bias in model is torch.Size([1]).
2022-01-21 00:38:36 | INFO | yolox.data.datasets.mot:39 - loading annotations into memory...
2022-01-21 00:38:50 | INFO | yolox.data.datasets.mot:39 - Done (t=14.65s)
2022-01-21 00:38:50 | INFO | pycocotools.coco:88 - creating index...
2022-01-21 00:38:53 | INFO | pycocotools.coco:88 - index created!
2022-01-21 00:39:12 | INFO | yolox.core.trainer:148 - init prefetcher, this might take one minute or less...

abalikhan avatar Jan 20 '22 23:01 abalikhan

I also have this problem, but I don't know how to solve it.

ht138612 avatar Apr 26 '22 00:04 ht138612

You can try setting -d to 1 so that you can see what caused the error.

zengjie617789 avatar Jul 08 '22 09:07 zengjie617789

same

shuyu888 avatar Sep 28 '22 02:09 shuyu888

Me too. How did you fix it?

gyh420 avatar Dec 05 '22 08:12 gyh420

Set the -d argument to 1 so that the specific error is shown.

zengjie617789 avatar Dec 05 '22 09:12 zengjie617789

image

gyh420 avatar Dec 05 '22 09:12 gyh420

Do you mean this?

gyh420 avatar Dec 05 '22 09:12 gyh420

No, that is a warning; I mean the error. What message is shown after you run this?

zengjie617789 avatar Dec 05 '22 09:12 zengjie617789

image

gyh420 avatar Dec 05 '22 09:12 gyh420

I have two errors. Which one, please?

gyh420 avatar Dec 05 '22 09:12 gyh420

Apparently the first error, dude! Your input img is None. Make sure your image is not corrupted.
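One way to look for such images before training is to walk the annotation file and try to load every entry. This is only a rough sketch, not part of ByteTrack: the function name find_unreadable_images is made up, and it assumes the annotation file is COCO-style JSON with an "images" list whose entries carry a "file_name" relative to the image directory. It relies on cv2.imread returning None for a file it cannot read, so it catches missing and undecodable files but may not catch every partially damaged one.

```python
import json
import os


def find_unreadable_images(ann_path, img_dir, reader=None):
    """Return the file names listed in a COCO-style JSON that cannot be loaded.

    `reader` is any callable that takes a path and returns None on failure.
    It defaults to cv2.imread, which returns None instead of raising.
    """
    if reader is None:
        import cv2  # imported lazily; only needed for the default reader
        reader = cv2.imread

    with open(ann_path) as f:
        images = json.load(f)["images"]  # assumed COCO-style layout

    bad = []
    for info in images:
        path = os.path.join(img_dir, info["file_name"])
        # A missing file and an undecodable file both end up as img = None.
        if not os.path.isfile(path) or reader(path) is None:
            bad.append(info["file_name"])
    return bad
```

Call it with the path to your own train.json and the folder that holds the images; an empty list means every listed file could at least be opened and decoded.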

zengjie617789 avatar Dec 05 '22 09:12 zengjie617789

Is it this one: An error has been caught in function 'launch', process 'MainProcess' (11136), thread 'MainThread' (44096): ?

Do you mean something is wrong with my input images?

gyh420 avatar Dec 05 '22 09:12 gyh420

This one:

image

If you hit the error as soon as training starts, the problem is in your dataloader, which was set up incorrectly. If you hit it a few minutes into an epoch, one of the images in your dataset is corrupted.
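To tell the two cases apart without waiting on worker processes, you could index the dataset directly in the main process and see where it first fails. A minimal sketch, assuming your dataset object is map-style (it supports len() and dataset[i], like a torch.utils.data.Dataset); first_failing_sample is a hypothetical helper, not a YOLOX or ByteTrack function.

```python
def first_failing_sample(dataset, limit=None):
    """Load samples one by one in the current process.

    Returns (index, exception) for the first sample that raises,
    or None if every checked sample loads without error.
    """
    n = len(dataset) if limit is None else min(limit, len(dataset))
    for i in range(n):
        try:
            dataset[i]  # runs the same loading code a dataloader worker would run
        except Exception as exc:
            return i, exc
    return None
```

A failure at index 0 suggests the dataset itself is set up wrongly (paths, annotation file), while a failure at a later index points to that particular sample.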

zengjie617789 avatar Dec 05 '22 09:12 zengjie617789

image

gyh420 avatar Dec 05 '22 09:12 gyh420

Thanks for your help. Now I get this problem when I run: python3 tools/track.py -f exps/example/mot/yolox_x_ablation.py -c pretrained/bytetrack_ablation.pth.tar -b 1 -d 1 --fp16 --fuse

image

Why?

gyh420 avatar Dec 05 '22 12:12 gyh420