BigGAN-PyTorch
Out of memory
Why does the program run fine during training but get stuck when saving weights?
Saving weights to /data1/code/BigGAN-PyTorch/weights/BigGAN_I512_seed0_Gch96_Dch96_bs8_nDa8_nGa8_Glr1.0e-04_Dlr4.0e-04_Gnlinplace_relu_Dnlinplace_relu_Ginitortho_Dinitortho_Gattn64_Dattn64_Gshared_hier_ema...
Saving weights to /data1/code/BigGAN-PyTorch/weights/BigGAN_I512_seed0_Gch96_Dch96_bs8_nDa8_nGa8_Glr1.0e-04_Dlr4.0e-04_Gnlinplace_relu_Dnlinplace_relu_Ginitortho_Dinitortho_Gattn64_Dattn64_Gshared_hier_ema/copy0...
RuntimeError: CUDA out of memory. Tried to allocate 3.75 GiB (GPU 0; 11.91 GiB total capacity; 6.14 GiB already allocated; 416.75 MiB free; 10.94 GiB reserved in total by PyTorch)
Even though GPU memory usage was only around 50% beforehand, it still runs out of memory after step 1000. I changed the code in utils, device('cuda') -> device('cpu'), but it seems to make no difference.
I have the same problem. Have you solved it? If so, could you tell me how? Thank you very much!
By default, training saves the model every 1000 steps and generates sample images at that moment, which takes considerable extra GPU memory. So I think you can avoid the issue by disabling or spacing out the sample generation; have a look at the --test_every and --save_every params. A sketch of the idea follows.
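For illustration, here is a minimal, hedged sketch of how checkpoint-time sampling can be made less memory-hungry: run the generator under torch.no_grad() and move samples off the GPU immediately. The names G_ema, z_, and y_ are assumptions mirroring the EMA generator and the latent/label buffers used in BigGAN-PyTorch's sampling code; this is not the repo's actual save-and-sample path.

```python
import torch

# Hedged sketch (not the repo's exact code): sample at checkpoint time
# without building an autograd graph or keeping images on the GPU.
# Assumes BigGAN-PyTorch-style objects: G_ema is the EMA generator,
# z_/y_ are utils.Distribution buffers with an in-place .sample_() method,
# and G_ema.shared maps class labels to embeddings.
def sample_at_checkpoint(G_ema, z_, y_, num_batches=4):
    G_ema.eval()
    images = []
    with torch.no_grad():                       # no gradients -> far less memory
        for _ in range(num_batches):
            z_.sample_()                        # redraw latents in place
            y_.sample_()                        # redraw class labels in place
            fake = G_ema(z_, G_ema.shared(y_))  # forward pass only
            images.append(fake.cpu())           # move off the GPU right away
    torch.cuda.empty_cache()                    # release unused cached memory
    return torch.cat(images)
```

In practice it may be enough to just space the steps out, e.g. passing something like --test_every 4000 --save_every 2000 to train.py (the values here are illustrative, not recommendations), so that sampling happens less often relative to saving.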