unlabel_pred error and cannot find the images

Open Luojlong opened this issue 3 years ago • 4 comments

batch_mlvl_bboxes /= batch_mlvl_bboxes.new_tensor(
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 5142/5142, 19.1 task/s, elapsed: 269s, ETA: 0s
2022-11-24 10:22:08,173 - mmdet - INFO - [INFO] Unlabel pred Done!
Traceback (most recent call last):
  File "tools/train.py", line 202, in <module>
    main()
  File "tools/train.py", line 190, in main
    train_detector(
  File "/home/hello/PycharmProjects/pythonProject/new_DSL/DSL/mmdet/apis/train.py", line 218, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/hello/PycharmProjects/pythonProject/new_DSL/DSL/mmdet/runner/hooks/semi_epoch_based_runner.py", line 345, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/hello/PycharmProjects/pythonProject/new_DSL/DSL/mmdet/runner/hooks/semi_epoch_based_runner.py", line 267, in train
    self.call_hook('after_train_iter')
  File "/home/hello/anaconda3/envs/Torch-DSL/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/hello/PycharmProjects/pythonProject/new_DSL/DSL/mmdet/runner/hooks/unlabel_pred_hook.py", line 460, in after_train_iter
    self.after_train_iter_func(runner)
  File "/home/hello/PycharmProjects/pythonProject/new_DSL/DSL/mmdet/runner/hooks/unlabel_pred_hook.py", line 517, in after_train_iter_func
    assert len(runner.imagefiles) == 2
AssertionError
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 16269) of binary: /home/hello/anaconda3/envs/Torch-DSL/bin/python

When I tried to debug by setting preload=1 and start_point=2 to reduce training time, a different error occurred:

loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.13s)
creating index...
index created!
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
loading annotations into memory...
Done (t=0.20s)
creating index...
index created!
[ERROR][ModelInfer] Found no image in /home/hello/PycharmProjects/pythonProject/new_DSL/DSL/mydata/semicoco/images/full

But that directory does contain images, and I don't know what is going wrong. Can anybody help me? Thanks a lot.

Luojlong avatar Nov 24 '22 02:11 Luojlong

@progincline Please double-check the image path, and set preload=2*num_worker+2 when per_gpu_image_num = 2.
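The rule of thumb above can be written out as a small helper. This is only a sketch: `suggested_preload` is a hypothetical name, not part of DSL, and it assumes `num_worker` is the per-GPU dataloader worker count, which the thread does not confirm. The formula is stated only for per_gpu_image_num = 2, so the helper refuses other values rather than guessing.

```python
def suggested_preload(num_worker: int, per_gpu_image_num: int = 2) -> int:
    """Return 2 * num_worker + 2, the preload suggested for per_gpu_image_num = 2.

    Hypothetical helper for illustration; it is not a DSL API.
    """
    if per_gpu_image_num != 2:
        # The thread gives no rule for other batch sizes, so do not extrapolate.
        raise ValueError("the rule is only stated for per_gpu_image_num = 2")
    if num_worker < 0:
        raise ValueError("num_worker must be non-negative")
    return 2 * num_worker + 2


if __name__ == "__main__":
    print(suggested_preload(num_worker=2))  # 2 * 2 + 2 = 6
```

By this rule, preload=1 is below the suggested value even with zero workers, since 2 * 0 + 2 = 2.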

chenbinghui1 avatar Nov 24 '22 02:11 chenbinghui1

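1. The image path is fine. Does num_worker refer to the value of workers_per_gpu? If so, I have set it correctly. The problem is that every run reports [ERROR][ModelInfer] Found no image in /home/hello/PycharmProjects/pythonProject/new_DSL/DSL/mydata/semicoco/images/full. I ran into the same problem yesterday, and it went away after I set up the code again from scratch. But once a run has failed, every later run reports this error, so I am a bit confused.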

Luojlong avatar Nov 24 '22 03:11 Luojlong

I have set the image paths for train, unlabel_train/pred, and val in the .sh script. I am just not sure which part of the code this error comes from.

Luojlong avatar Nov 24 '22 03:11 Luojlong

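@progincline I see you are using your own data, so I am not sure whether the corresponding semicoco-format files (for example, ann_file) were generated. Please check them again, or debug and print the relevant values at the point where the error occurs.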
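One way to do that check outside the training run is a short standalone script. This is a minimal sketch, not DSL code: `list_images`, `missing_images`, and `IMAGE_EXTS` are hypothetical names, the extension list is an assumption, and the annotation file is assumed to follow the standard COCO layout with an `images` list whose entries carry `file_name`. How DSL itself scans the directory is not shown in this thread, so treat the output as a hint only.

```python
import json
from pathlib import Path

# Assumption: extensions that count as images; adjust to your data.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(image_dir):
    """Return sorted names of files directly in image_dir with an image extension."""
    root = Path(image_dir)
    if not root.is_dir():
        return []  # a wrong or missing path also yields "no images"
    return sorted(
        p.name for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


def missing_images(ann_file, image_dir):
    """Return file_name entries from a COCO-style ann_file that are absent from image_dir."""
    with open(ann_file, "r", encoding="utf-8") as f:
        coco = json.load(f)
    root = Path(image_dir)
    return [
        img["file_name"] for img in coco.get("images", [])
        if not (root / img["file_name"]).is_file()
    ]


def report(image_dir, ann_file):
    """Print a short summary of both checks."""
    print("images found in directory:", len(list_images(image_dir)))
    print("annotated images missing on disk:", len(missing_images(ann_file, image_dir)))


# Example call (both paths are placeholders for your own files):
# report("path/to/images/full", "path/to/ann_file.json")
```

If the first count is zero while the directory looks populated, the images may sit in a subfolder or use an extension outside the assumed list.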

chenbinghui1 avatar Nov 24 '22 10:11 chenbinghui1