PyTorch-YOLOv3

Caught RuntimeError in DataLoader worker process 0 while running test.py for COCO

Open alyssa121 opened this issue 5 years ago • 8 comments

command: python test.py --weights_path yolov3.weights

error:
Namespace(batch_size=8, class_path='data/coco.names', conf_thres=0.001, data_config='config/coco.data', img_size=416, iou_thres=0.5, model_def='config/yolov3.cfg', n_cpu=8, nms_thres=0.5, weights_path='yolov3.weights')
Compute mAP...
Detecting objects: 0%| | 0/625 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "test.py", line 99, in <module>
    batch_size=8,
  File "test.py", line 36, in evaluate
    for batch_i, (_, imgs, targets) in enumerate(tqdm.tqdm(dataloader, desc="Detecting objects")):
  File "/home/bobo/anaconda3/envs/detectron/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1022, in __iter__
    for obj in iterable:
  File "/home/bobo/anaconda3/envs/detectron/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/home/bobo/anaconda3/envs/detectron/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/home/bobo/anaconda3/envs/detectron/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/bobo/anaconda3/envs/detectron/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/bobo/anaconda3/envs/detectron/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/bobo/dxx/PyTorch-YOLOv3/utils/datasets.py", line 149, in collate_fn
    targets = torch.cat(targets, 0)
RuntimeError: expected a non-empty list of Tensors

alyssa121 avatar Oct 09 '19 03:10 alyssa121
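For context on the root cause: the final error comes from torch.cat being called on an empty list inside collate_fn in utils/datasets.py. That happens when every sample in the batch comes back with no targets, most often because none of the label files derived from the image paths could be found, so every targets entry is None. Below is a minimal defensive sketch of such a collate_fn, not this repo's exact code; the (path, img, targets) sample layout and the 6-column target tensor are assumptions based on how ListDataset appears to work.

import torch

def collate_fn(batch):
    # Each sample is assumed to be (path, img, targets); targets is None
    # when no label file exists for the image.
    paths, imgs, targets = list(zip(*batch))

    # Drop samples that have no annotations.
    targets = [boxes for boxes in targets if boxes is not None]

    if len(targets) == 0:
        # Every image in this batch is unlabeled. torch.cat([]) would raise
        # "expected a non-empty list of Tensors", so return an empty
        # (0, 6) tensor instead: [sample_idx, class, x, y, w, h].
        targets = torch.zeros((0, 6))
    else:
        # Tag each target row with the index of its image in the batch.
        for i, boxes in enumerate(targets):
            boxes[:, 0] = i
        targets = torch.cat(targets, 0)

    imgs = torch.stack(imgs, 0)
    return paths, imgs, targets

Even with this guard, hitting the empty case on every batch usually means the label paths are simply not resolving, so checking the paths in config/coco.data and the labels/ directory layout is the more useful fix.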

I have the same problem. How did you solve it?

hanrui15765510320 avatar Nov 28 '19 09:11 hanrui15765510320

I have the same problem. How did you solve it?

wydonglove avatar Dec 02 '19 03:12 wydonglove

Hey, have you guys solved the problem yet?

JingshanXu avatar Dec 21 '19 02:12 JingshanXu

Hey, have you guys solved the problem yet?

QZ-cmd avatar Mar 11 '20 12:03 QZ-cmd

I think it might actually be a memory issue, but I haven't confirmed it yet. I see this when I load a large number of training samples at once, but my code runs fine on smaller sets.

This is definitely a PyTorch issue though, not unique to this repo.

ss32 avatar Mar 23 '20 01:03 ss32

Confirmed: it's a memory issue. Once training starts, I see my available memory drop steadily as the epoch progresses. I'm training on the GPU, but something is being stored in regular (host) memory during each epoch.

ss32 avatar Mar 23 '20 01:03 ss32
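One pattern worth ruling out when memory climbs steadily over an epoch (not confirmed to be what is happening here) is accumulating loss or metric tensors across iterations: each stored tensor keeps its autograd history alive, so memory grows with every batch. A self-contained toy sketch of the difference, not this repo's train.py:

import torch
import torch.nn as nn

# Toy loop illustrating storing loss tensors vs. plain Python floats.
model = nn.Linear(100, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

kept_tensors = []   # grows: each entry keeps the loss tensor and its history alive
running_loss = 0.0  # stays flat: .item() stores only a Python float

for step in range(1000):
    x = torch.randn(32, 100)
    loss = model(x).pow(2).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # kept_tensors.append(loss)   # memory use creeps up every step
    running_loss += loss.item()   # graph is freed once loss goes out of scope

If train.py exposes the same --n_cpu and --batch_size options as test.py, lowering them also limits how many prefetched batches the DataLoader workers hold in host RAM, which is another easy thing to try.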

I'm running this on Colab but still getting the same error.

osaka004 avatar Oct 08 '20 11:10 osaka004

Has anyone been able to solve this issue?

anas-zafar avatar Mar 21 '22 05:03 anas-zafar