
RuntimeError: CUDA out of memory.

Open prashanth31 opened this issue 4 years ago • 6 comments

Can someone help me solve the "CUDA out of memory" error? I think it has something to do with reducing the batch size, but I am not sure where in the code I can do that. Here is the full error message:

Traceback (most recent call last):
  File "main_train.py", line 29, in <module>
    train(opt, Gs, Zs, reals, NoiseAmp)
  File "c:\Projects\PK\Phd\Paper4_GAN\SinGAN-master\SinGAN\training.py", line 39, in train
    z_curr,in_s,G_curr = train_single_scale(D_curr,G_curr,reals,Gs,Zs,in_s,NoiseAmp,opt)
  File "c:\Projects\PK\Phd\Paper4_GAN\SinGAN-master\SinGAN\training.py", line 162, in train_single_scale
    gradient_penalty.backward()
  File "c:\ProgramData\Anaconda3\envs\torch\lib\site-packages\torch\tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "c:\ProgramData\Anaconda3\envs\torch\lib\site-packages\torch\autograd\__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 2.00 GiB total capacity; 1.16 GiB already allocated; 18.86 MiB free; 1.28 GiB reserved in total by PyTorch)

prashanth31 • Mar 20 '21 16:03
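
The numbers in the error message already show how tight the card is: a 30.00 MiB request against 18.86 MiB free on a 2.00 GiB GPU. As a rough sketch (plain Python, no PyTorch needed), the message can be parsed to make that shortfall explicit. The function name `parse_oom` and the regular expressions are illustrative assumptions, and the message wording differs between PyTorch versions, so the patterns may need adjusting.

```python
import re

# The message text is copied from the traceback in this issue.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 2.00 GiB total "
    "capacity; 1.16 GiB already allocated; 18.86 MiB free; 1.28 GiB reserved "
    "in total by PyTorch)"
)

# Conversion factors to MiB.
_TO_MIB = {"KiB": 1.0 / 1024.0, "MiB": 1.0, "GiB": 1024.0}
_SIZE = r"([\d.]+) (KiB|MiB|GiB)"


def parse_oom(message):
    """Return the sizes mentioned in an OOM message, in MiB (hypothetical helper)."""
    patterns = {
        "requested": r"Tried to allocate " + _SIZE,
        "total": _SIZE + r" total capacity",
        "allocated": _SIZE + r" already allocated",
        "free": _SIZE + r" free",
        "reserved": _SIZE + r" reserved",
    }
    stats = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            stats[key] = float(match.group(1)) * _TO_MIB[match.group(2)]
    return stats


stats = parse_oom(OOM_MESSAGE)
shortfall = stats["requested"] - stats["free"]
print(f"requested {stats['requested']:.2f} MiB, free {stats['free']:.2f} MiB")
print(f"shortfall {shortfall:.2f} MiB on a {stats['total']:.0f} MiB card")
```

A shortfall this small on a 2 GiB card suggests the run is simply at the limit of the hardware, which fits the later replies about switching to the CPU or to a smaller image.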

I was able to train my network by using the CPU instead of the GPU. It took a lot longer but at least it got the job done.

prashanth31 • Mar 23 '21 01:03

hey @prashanth31, I was wondering, how did you get it to run on the GPU? what's the command I should use?

ankuroo • Apr 07 '21 19:04

I ended up running the cpu only version. Takes a lot of time but at least works.

prashanth31 • Apr 07 '21 22:04
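
The thread never records the exact command used for the CPU-only run. As a hedged sketch of how such a switch is commonly wired into a PyTorch training script, the snippet below uses a command-line flag to force the CPU. The flag name `--not_cuda` is an assumption here (check the repository's `config.py` for the actual option), and the `cuda_available` argument stands in for the result of `torch.cuda.is_available()`.

```python
import argparse


def build_parser():
    """Argument parser with a single, assumed, CPU-forcing flag."""
    parser = argparse.ArgumentParser(description="device selection sketch")
    parser.add_argument(
        "--not_cuda",
        action="store_true",
        help="run on the CPU even if a GPU is present (flag name is an assumption)",
    )
    return parser


def pick_device(not_cuda, cuda_available):
    """Return the device string to pass to torch.device(...)."""
    if cuda_available and not not_cuda:
        return "cuda:0"
    return "cpu"


# Simulate `python main_train.py --not_cuda` on a machine that has a GPU.
opt = build_parser().parse_args(["--not_cuda"])
print(pick_device(opt.not_cuda, cuda_available=True))
```

Training on the CPU avoids the 2 GiB limit because it uses system RAM instead, at the cost of the much longer run time mentioned above.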

I had a similar issue. I was processing a 1024-pixel image (--max_size 1024), and at about scale 11 it crashed with the CUDA out-of-memory error. I have gone back to 512. The GPU on the compute node is a GeForce GTX 1080 Ti: https://www.nvidia.com/en-gb/geforce/graphics-cards/geforce-gtx-1080-ti/specifications/

metaphorz • Jul 03 '21 20:07
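
To see why dropping the max_size setting from 1024 to 512 helps, here is a simplified sketch of a coarse-to-fine image pyramid. It assumes a fixed downscaling factor of 0.75 and a coarsest size of 25 pixels; both values, and the helper name `pyramid_sizes`, are assumptions for illustration, and the real code may adjust the factor so the scales fit exactly. The point is only that the finest scales dominate memory, since activation memory grows roughly with the pixel count.

```python
def pyramid_sizes(max_size, min_size=25, scale_factor=0.75):
    """List the side lengths of a simplified pyramid, coarsest scale first."""
    sizes = []
    size = float(max_size)
    while size >= min_size:
        sizes.append(round(size))
        size *= scale_factor  # each coarser scale is 0.75x the previous one
    return sizes[::-1]


for max_size in (512, 1024):
    sizes = pyramid_sizes(max_size)
    finest_pixels = sizes[-1] ** 2
    print(f"max_size={max_size}: {len(sizes)} scales, "
          f"finest scale has {finest_pixels} pixels")

# Halving the side length quarters the pixel count at the finest scale.
pixel_ratio = pyramid_sizes(1024)[-1] ** 2 / pyramid_sizes(512)[-1] ** 2
print(f"pixel ratio 1024 vs 512: {pixel_ratio}")
```

Under these assumptions, the 1024 run has two extra scales at the fine end, and its finest scale holds four times as many pixels as the 512 run, which is consistent with the crash appearing only late in training.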

@metaphorz How did you go back to 512, and where is the code for the fix? Please! Thank you.

vuhungtvt2018 • Mar 04 '23 01:03

This is so long ago I've forgotten. Been using Stable Diffusion through A1111 for most software runs.

metaphorz • Mar 04 '23 03:03