DataLoaders_DALI

RuntimeError

Open wuzhi19931128 opened this issue 5 years ago • 0 comments

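Hello, when imagenet.py reads the data, it immediately fails with an "an illegal memory access was encountered" error. What could be the cause? My GPUs are 2 × V100, so this should not be an out-of-memory problem. I have not changed anything in the source code except the dataset location. The error log is below: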

root@test-6gwz28fvc:/data1/test# python imagenet.py
DALI "gpu" variant
read 1281167 files from 1000 directories
140020509374208
Exception in thread: CUDA runtime API error cudaErrorIllegalAddress (77): an illegal memory access was encountered
Traceback (most recent call last):
  File "imagenet.py", line 105, in <module>
    num_threads=4, crop=224, device_id=0, num_gpus=1)
  File "imagenet.py", line 67, in get_imagenet_iter_dali
    dali_iter_train = DALIClassificationIterator(pip_train, size=pip_train.epoch_size("Reader") // world_size)
  File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 338, in __init__
    last_batch_padded = last_batch_padded)
  File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 148, in __init__
    self._first_batch = self.next()
  File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 245, in next
    return self.next()
  File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 163, in next
    outputs.append(p.share_outputs())
  File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/pipeline.py", line 409, in share_outputs
    return self._pipe.ShareOutputs()
RuntimeError: Critical error in pipeline: Error in thread 0: CUDA runtime API error cudaErrorIllegalAddress (77): an illegal memory access was encountered
Current pipeline object is no longer valid.
terminate called after throwing an instance of 'dali::CUDAError'
  what(): CUDA runtime API error cudaErrorIllegalAddress (77): an illegal memory access was encountered
Aborted (core dumped)

Could you please take a look? Thank you.

wuzhi19931128, Jan 14 '20 11:01