Pointnet2.PyTorch
CUDA: an illegal memory access was encountered.
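@sshaoshuai Hello, and thank you very much for providing this code. I ran into a problem while using it and would like to ask for your advice. I ran python setup.py install successfully and the CUDA ops can be called, but after a few epochs of training the following error appears: CUDA: an illegal memory access was encountered. (The line reported in the traceback differs from run to run, but the final error is always the same.) After some investigation I suspect it is related to CUDA and the thread allocation inside the ops. I had used this code for the previous few months without any problem; the issue only appeared recently. Have you run into anything similar? The full output of one failing run is below. Thank you very much.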
e=/pytorch/aten/src/THC/THCCachingHostAllocator.cpp line=265 error=77 : an illegal memory access was encountered
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCCachingHostAllocator.cpp line=265 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
  File "train_semseg.py", line 228, in <module>
    main(args)
  File "train_semseg.py", line 171, in main
    pred = model(points[:,:3,:],points[:,3:,:],target)
  File "/home/zjh/anaconda2/envs/pose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zjh/anaconda2/envs/pose/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/zjh/anaconda2/envs/pose/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/zjh/anaconda2/envs/pose/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/home/zjh/anaconda2/envs/pose/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/home/zjh/anaconda2/envs/pose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zjh/xumingye_temp/SEMSEG_GSNet_dense_xj-xi,xi_deep_+addEdge_2loss/SemSeg_GS.py", line 617, in forward
    _,idx0_4, idx4 = eigen_Net(N4_points.permute(0,2,1).contiguous(), k=self.k)
  File "/home/zjh/xumingye_temp/SEMSEG_GSNet_dense_xj-xi,xi_deep_+addEdge_2loss/SemSeg_GS.py", line 129, in eigen_Net
    feature = torch.cat(( x, eigen), dim=2)
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/THCCachingHostAllocator.cpp:265
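One thing that might help narrow this down: CUDA kernels launch asynchronously, so the Python line shown in a traceback like the one above is not necessarily where the fault happened, which would be consistent with the reported line changing between runs. A minimal sketch of forcing synchronous launches with the CUDA_LAUNCH_BLOCKING environment variable follows. The helper name enable_blocking_cuda_launches is hypothetical and not part of this repository, and it has to run before anything initializes CUDA.

```python
import os


def enable_blocking_cuda_launches(env=os.environ):
    """Hypothetical debugging helper (not part of Pointnet2.PyTorch).

    Sets CUDA_LAUNCH_BLOCKING=1 in the given environment mapping so that
    kernel launches run synchronously and an error is reported at the
    call that actually triggered it. Returns the value that was set.
    """
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env["CUDA_LAUNCH_BLOCKING"]


# Call this at the very top of the training script, before torch is
# imported or any CUDA work starts; the variable is read at CUDA start-up.
enable_blocking_cuda_launches()

# import torch  # assumption: imported only after the variable is set
```

Training will be noticeably slower with this setting, so it is only meant for the run used to locate the failing op. The same effect can be had by exporting the variable in the shell before launching the script.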