RuntimeError: CUDA error: an illegal memory access was encountered

Open mmsaban opened this issue 2 years ago • 2 comments

I am using Google Colab with the following package versions: pytorch3d 0.6.2, torch 1.11.0+cu113, Python 3.7.13.

I have recently been getting CUDA errors that I could not understand. After analyzing them, I came to the conclusion that they are triggered by pytorch3d.ops.sample_farthest_points(), and only when I use a batch_size of 2. Besides using a batch_size higher than 2, how can I prevent this error from happening? Or is it fine to simply always use a batch_size higher than 2?

I really appreciate any help you can provide.


RuntimeError Traceback (most recent call last)

in ()
19
20 # forward + backward + optimize
---> 21 outputs = net(inputs)
22 outputs = torch.squeeze(outputs)
23

3 frames

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []

in forward(self, E)
52 # Y = self.GX_2.execute(E, Y)
53 Y = self.LX_3.execute(E, Y)
---> 54 E, Y = self.f_sample_1.execute(E, Y)
55 # Y = self.GX_3.execute(E, Y)
56 # Y = self.LX_4.execute(E, Y)

in execute(self, E1, Y1, ratio)
235 len_out = int(E1.shape[2] / ratio)
236
--> 237 N, Kk = pytorch3d.ops.sample_farthest_points(E1, K=len_out) # Kk := batch, N/ratio, # N := batch, 4, N/ratio K=(int(E.shape[2]/ratio))
238 print("N.shape, Kk.shape, Y.shape, E.shape", N.shape, Kk.shape, Y1.shape, E1.shape)
239 E_hat = E1[0,:,Kk[0]][None,:,:]

/usr/local/lib/python3.7/dist-packages/pytorch3d/ops/sample_farthest_points.py in sample_farthest_points(points, lengths, K, random_start_point)
86 with torch.no_grad():
87 # pyre-fixme[16]: pytorch3d_._C has no attribute sample_farthest_points.
---> 88 idx = _C.sample_farthest_points(points, lengths, K, start_idxs)
89 sampled_points = masked_gather(points, idx)
90

RuntimeError: CUDA error: an illegal memory access was encountered

mmsaban, Jul 06 '22 10:07

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot], Aug 06 '22 05:08

I don't think I can replicate this. Can you? What is the shape and dtype of E1? What is len_out?

bottler, Aug 09 '22 14:08
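For anyone picking this up, here is a minimal sketch of how the details requested above (shape and dtype of E1, value of len_out) could be collected just before the failing call. The helper name `summarize_fps_inputs` and its output keys are hypothetical and not part of pytorch3d. It reads the shape as (N, P, D), which is the layout documented for `sample_farthest_points`, and flags the case where K is larger than the number of points per cloud. The comment in the traceback suggests E1 may be laid out as (batch, 4, N) with len_out derived from E1.shape[2], so this check seems worth running, but whether it is related to the crash is an assumption and has not been confirmed in this thread.

```python
import os

# Make CUDA kernel launches synchronous so the traceback points at the call
# that actually failed. This has to be set before the first CUDA call in the
# process (in a notebook, that usually means right after a runtime restart).
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def summarize_fps_inputs(points, K):
    """Hypothetical helper: gather the facts asked for in the thread.

    `points` is any tensor-like object exposing .shape, .dtype, .device and
    .is_contiguous() (a torch.Tensor does). The (N, P, D) reading of the
    shape follows the documented input layout of sample_farthest_points.
    """
    shape = tuple(points.shape)
    info = {
        "shape": shape,
        "dtype": str(points.dtype),
        "device": str(points.device),
        "contiguous": bool(points.is_contiguous()),
        "K": int(K),
    }
    if len(shape) == 3:
        batch, num_points, dim = shape
        info["batch"] = batch
        info["points_per_cloud"] = num_points
        info["dim"] = dim
        # K larger than P hints that the points axis may be in the wrong place.
        info["K_exceeds_points"] = int(K) > num_points
    return info
```

In the code from the report, this could be called as `print(summarize_fps_inputs(E1, len_out))` on the line before `pytorch3d.ops.sample_farthest_points(E1, K=len_out)`, and the printed dictionary pasted into the issue.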