BigGAN-PyTorch

Out of memory

Open diaodeyi opened this issue 3 years ago • 2 comments

Why does the program run fine during training but get stuck when saving weights?

Saving weights to /data1/code/BigGAN-PyTorch/weights/BigGAN_I512_seed0_Gch96_Dch96_bs8_nDa8_nGa8_Glr1.0e-04_Dlr4.0e-04_Gnlinplace_relu_Dnlinplace_relu_Ginitortho_Dinitortho_Gattn64_Dattn64_Gshared_hier_ema...
Saving weights to /data1/code/BigGAN-PyTorch/weights/BigGAN_I512_seed0_Gch96_Dch96_bs8_nDa8_nGa8_Glr1.0e-04_Dlr4.0e-04_Gnlinplace_relu_Dnlinplace_relu_Ginitortho_Dinitortho_Gattn64_Dattn64_Gshared_hier_ema/copy0...

RuntimeError: CUDA out of memory. Tried to allocate 3.75 GiB (GPU 0; 11.91 GiB total capacity; 6.14 GiB already allocated; 416.75 MiB free; 10.94 GiB reserved in total by PyTorch)

diaodeyi Apr 06 '21 09:04

Even though GPU memory usage was only about 50% beforehand, it still runs out of memory after step 1000. I changed the code in utils from Device('cuda') to Device('cpu'), but it seems to make no difference.

diaodeyi Apr 07 '21 06:04

I have the same problem. Have you solved it? If so, could you tell me how? Thank you very much!

Wanghd-MVP Aug 25 '21 13:08

With the default arguments, the model is saved every 1000 steps and sample images are generated at the same time, which takes considerable extra GPU memory. That is why training can run at around 50% usage and still fail at step 1000. I think you can avoid the issue by disabling the sample generation. Take a look at the --test_every and --save_every parameters.
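To see how large the spike at save time is, it may help to pull the numbers out of the error message quoted at the top of this thread. The snippet below is only a rough sketch using the Python standard library; the helper name `parse_oom` and the variable names are made up for illustration and are not part of BigGAN-PyTorch or PyTorch.

```python
import re

# The error message from this thread, copied verbatim.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 3.75 GiB "
    "(GPU 0; 11.91 GiB total capacity; 6.14 GiB already allocated; "
    "416.75 MiB free; 10.94 GiB reserved in total by PyTorch)"
)

UNIT_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_oom(message):
    """Hypothetical helper: return the sizes in the message, converted to GiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in message")
        value, unit = match.groups()
        sizes[name] = float(value) * UNIT_TO_GIB[unit]
    return sizes


sizes = parse_oom(OOM_MESSAGE)

# Memory PyTorch has reserved from the GPU but is not using for live tensors.
cached_unused = sizes["reserved"] - sizes["allocated"]
# How far the single failed request exceeds what the GPU reports as free.
shortfall = sizes["requested"] - sizes["free"]

print(f"requested at save time: {sizes['requested']:.2f} GiB")
print(f"allocated by training:  {sizes['allocated']:.2f} GiB")
print(f"reserved but unused:    {cached_unused:.2f} GiB")
print(f"request minus free:     {shortfall:.2f} GiB")
```

Read this way, training itself holds about 6.14 GiB of the 11.91 GiB card, which matches the roughly 50% usage reported above, and the failure comes from one extra 3.75 GiB request at save time. About 4.80 GiB is reserved but not allocated, so the request possibly failed because no single free block was large enough, though that is a guess from the numbers alone.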

sevenquarkoniums Jan 09 '22 09:01