RNAN

RuntimeError: CUDA out of memory.

Open Yi-Qi638 opened this issue 5 years ago • 9 comments

I used a 1080 Ti with 11 GB, but when I run t.test(), at sr = self.model(lr, idx_scale) I get RuntimeError: CUDA out of memory. How can I fix this? Thanks.

Yi-Qi638 avatar Nov 15 '19 08:11 Yi-Qi638

I also tried on a Tesla V100 GPU with 32 GB of memory, but it still runs out of memory. I wonder how the owner conducted the experiments. Or is something else wrong?

zcpchaos avatar Jan 06 '20 05:01 zcpchaos

I am also having the same issue with 2 GPUs (48 GB Teslas). It is fascinating that the author could train the model on a Titan Xp?!

Even though I have tried to input a single batch of 48x48x3, the memory issue remains.

Magauiya avatar Mar 12 '20 16:03 Magauiya

Problem solved! My issue was in torchsummary, where I had passed 256x256 images. More generally, the bottleneck is the NLB (non-local block), which consumes too much memory because of the large matrix multiplication: f = torch.matmul(theta_x, phi_x)

Magauiya avatar Mar 13 '20 04:03 Magauiya
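The quadratic cost Magauiya describes can be sketched numerically. This is a back-of-the-envelope estimate only (`nlb_attention_bytes` is a hypothetical helper, not part of the repo): for an H x W feature map, f = torch.matmul(theta_x, phi_x) materializes an (H*W) x (H*W) attention matrix per image.

```python
# Rough memory estimate for the non-local block's attention map.
# For an H x W feature map, f = torch.matmul(theta_x, phi_x)
# produces an (H*W) x (H*W) matrix per image in the batch.
def nlb_attention_bytes(h, w, batch=1, dtype_bytes=4):
    hw = h * w
    return batch * hw * hw * dtype_bytes

# At 48x48 the map is tiny; at 256x256 it needs 16 GiB in fp32:
print(nlb_attention_bytes(48, 48) / 2**20)    # → 20.25 (MiB)
print(nlb_attention_bytes(256, 256) / 2**30)  # → 16.0 (GiB)
```

This is why training on 48x48 patches fits on a Titan Xp while a 256x256 test input blows past even 32 GB cards.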

> Problem solved! The issue was in torchsummary, where I put 256x256 images. Anyway, the problem is in NLB, which consumes too much memory because of large matrix multiplication: f = torch.matmul(theta_x, phi_x)

So, any suggestion on how I can run the test code on a Titan V?

supratikbanerjee avatar Apr 03 '20 20:04 supratikbanerjee

> Problem solved! The issue was in torchsummary, where I put 256x256 images. Anyway, the problem is in NLB, which consumes too much memory because of large matrix multiplication: f = torch.matmul(theta_x, phi_x)

> So any suggestion on how can I run the test code on Titan V?

Change the batch size to 1; it should work even if the image is 256x256x3 (my case). Otherwise, if you have several Titan Vs, you should implement model sharding.

Magauiya avatar Apr 06 '20 07:04 Magauiya
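Model sharding as suggested above can be sketched as follows. This is an illustrative two-GPU split, not the RNAN code; the `head`/`tail` names and the two-stage split are assumptions for the example.

```python
import torch
import torch.nn as nn

class ShardedModel(nn.Module):
    """Illustrative model sharding: run the first half of the network
    on one device and the second half on another, moving the
    intermediate activations between them in forward()."""
    def __init__(self, head: nn.Module, tail: nn.Module,
                 dev0='cuda:0', dev1='cuda:1'):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        self.head = head.to(dev0)  # first half of the layers
        self.tail = tail.to(dev1)  # second half of the layers

    def forward(self, x):
        x = self.head(x.to(self.dev0))
        return self.tail(x.to(self.dev1))
```

Unlike data parallelism, this splits the parameters and activations across cards, so each GPU only needs to hold its own part of the network.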

I have the same problem in the testing phase, even when I use the '--chop' option. Do you know the reason? I use one 1080 Ti GPU. Thanks.

jiandandan001 avatar Aug 30 '20 01:08 jiandandan001
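For context, '--chop'-style inference can be sketched as below. This is a simplified, non-overlapping tiling, not the repo's actual chop forward (which uses overlapping quadrants); the point is that each tile is still fed through the non-local block, so if the tile itself is large, the quadratic attention map can OOM regardless.

```python
import torch

def tiled_forward(model, x, tile=128):
    """Run the model on non-overlapping tiles and stitch the outputs.

    Simplified stand-in for chop-style inference: assumes the model
    preserves spatial size (scale 1) and that H and W divide by `tile`.
    """
    b, c, h, w = x.shape
    out = torch.zeros_like(x)
    for i in range(0, h, tile):
        for j in range(0, w, tile):
            out[:, :, i:i+tile, j:j+tile] = model(x[:, :, i:i+tile, j:j+tile])
    return out
```

Shrinking the tile size shrinks the per-tile attention map quadratically, which is why a smaller chop threshold can help where '--chop' alone does not.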

I have the same problem in the testing phase, even when I use the '--chop' option. Do you know the reason? I use four 1080 Ti GPUs. Thanks.

chenjiachengzzz avatar Sep 18 '21 11:09 chenjiachengzzz

> I have the same problem in the testing phase, even i use the '--chop' option. Do you know the reason? I use 1 1080 ti GPU. Thanks

Did you fix this problem?

chenjiachengzzz avatar Sep 18 '21 11:09 chenjiachengzzz

Referring to AlexHex7/Non-local_pytorch#9, I solved this problem. The cause is that the code was written for PyTorch 0.3.1; in PyTorch >= 0.4.0, volatile is deprecated, so wrap the test code in torch.no_grad().

wanjh1024 avatar Nov 28 '22 02:11 wanjh1024
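The fix wanjh1024 points to can be sketched like this (the function and model names here are illustrative, not the repo's actual test loop):

```python
import torch

def run_test(model, lr, idx_scale):
    """Inference without autograd bookkeeping.

    PyTorch 0.3.x marked test inputs with Variable(x, volatile=True);
    since 0.4.0 the replacement is the torch.no_grad() context, which
    stops activations being kept for backward and cuts test-time memory.
    """
    model.eval()
    with torch.no_grad():
        return model(lr, idx_scale)
```

Without the no_grad() context, every forward pass at test time keeps intermediate activations for a backward pass that never happens, which is especially costly with the non-local block's large attention maps.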