TransGAN

One error

Open wudiduojimone opened this issue 3 years ago • 7 comments

RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 7.94 GiB total capacity; 7.03 GiB already allocated; 144.81 MiB free; 7.12 GiB reserved in total by PyTorch)

I hit this error after running "python exps/cifar_train" in the terminal. It appears right after "path:logs/cifar_train_2022_03_22_19_29_36 0%| |0/1563 [00:00<?, it/s]". I know it means CUDA is out of memory, but this is the only program I am running, and no images have been loaded yet. Has the author or anyone else run into this error, and how did you deal with it?

wudiduojimone avatar Mar 22 '22 11:03 wudiduojimone
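
The numbers in the error message already show how far over budget the allocation is. Below is a minimal sketch, in plain Python, that pulls the sizes out of the message quoted above and compares the request with the free memory. The helper name `parse_oom` and the regular expressions are illustrative assumptions written for this exact message wording, not part of PyTorch or TransGAN.

```python
import re

# Error text copied verbatim from the report above.
MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB "
    "(GPU 0; 7.94 GiB total capacity; 7.03 GiB already allocated; "
    "144.81 MiB free; 7.12 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom(message):
    """Hypothetical helper: return the sizes in the message, converted to MiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * UNIT_TO_MIB[match.group(2)]
    return sizes


sizes = parse_oom(MESSAGE)
# How much more memory the failed allocation needed than was free.
shortfall = sizes["requested"] - sizes["free"]
print(f"requested {sizes['requested']:.2f} MiB, free {sizes['free']:.2f} MiB")
print(f"short by {shortfall:.2f} MiB")
```

In other words, the failed request alone is several times larger than the memory still free on the card, so the run fails on the first step rather than partway through training.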

What's your batch size, and how many GPUs are you using?

yifanjiang19 avatar Mar 30 '22 20:03 yifanjiang19

I have the same problem with only one GPU.

Nanboy-Ronan avatar Apr 10 '22 22:04 Nanboy-Ronan

@Nanboy-Ronan What's your batch size?

yifanjiang19 avatar Apr 14 '22 02:04 yifanjiang19

Thank you for the reply. My training batch_size is 64. Should I probably lower it? But I have seen people say that they trained with 8 GPUs for 1.5 days, which makes me hesitant to continue.

Nanboy-Ronan avatar Apr 14 '22 02:04 Nanboy-Ronan

Thanks for your patient reply, and sorry that I only saw it just now. I may have the same problem as Nanboy-Ronan, since I also have only one GPU.

wudiduojimone avatar Apr 14 '22 02:04 wudiduojimone

@Nanboy-Ronan One GPU is not enough to train it. But if you still want to train it, you should reduce your batch size to a much smaller value.

yifanjiang19 avatar Apr 17 '22 18:04 yifanjiang19

May I know how to do this? This is for my major project.
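
One common way to shrink the batch without changing the effective batch size is gradient accumulation: process a few samples at a time and combine the partial gradients before each update. The sketch below is framework-free Python on a toy one-parameter model, only to show that the accumulated gradient matches the full-batch one. The function names and the toy data are assumptions for illustration and are not taken from the TransGAN code. In PyTorch the same pattern is usually written by scaling each micro-batch loss by its share of the batch, calling `backward()` on it, and stepping the optimizer once per full batch. Layers that depend on batch statistics may behave differently under this scheme.

```python
import random


def mean_gradient(w, xs, ys):
    """Gradient of the mean squared error of y ~ w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_gradient(w, xs, ys, micro_batch_size):
    """Same gradient, computed one micro-batch at a time.

    Only `micro_batch_size` samples are processed at once; each partial
    gradient is weighted by its share of the full batch before being summed.
    """
    total = len(xs)
    grad = 0.0
    for start in range(0, total, micro_batch_size):
        chunk_x = xs[start:start + micro_batch_size]
        chunk_y = ys[start:start + micro_batch_size]
        grad += mean_gradient(w, chunk_x, chunk_y) * len(chunk_x) / total
    return grad


random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(64)]  # batch of 64, as in the thread
ys = [3.0 * x + random.gauss(0.0, 0.1) for x in xs]  # toy targets, made up here

full = mean_gradient(0.5, xs, ys)
accumulated = accumulated_gradient(0.5, xs, ys, micro_batch_size=8)
# The two values agree up to floating-point rounding.
print(abs(full - accumulated) < 1e-9)
```

This lowers peak memory because only one micro-batch is held at a time, but it does not make training faster, so the run time reported for 8 GPUs would still be much longer on a single card.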