MC-GAN

RuntimeError: CUDA error: out of memory

Open damtharvey opened this issue 6 years ago • 6 comments

How much memory do I need for Capitals64? I have 11 GB.

damtharvey avatar Nov 27 '18 20:11 damtharvey

That should be enough. Do you have a batch size larger than 150?

azadis avatar Nov 28 '18 08:11 azadis

Sorry for the slow reply. I will try again when I get my GPU back after NIPS.

damtharvey avatar Nov 30 '18 08:11 damtharvey

I have 11 GB too. I tried to train Capitals64 with batch size=150, but got CUDA out of memory. There's no problem with batch size=64. (Batch size=120 fails too. I haven't tried sizes between 64 and 120.)

sesebuckin avatar Dec 04 '18 08:12 sesebuckin
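For the untested range between 64 and 120, a binary search would narrow down the largest batch size that fits. The following is only a sketch, not part of MC-GAN: `largest_fitting_batch_size`, `oom_predicate`, and `try_step` are hypothetical names, and `try_step(batch_size)` is assumed to be something you write yourself that runs a single training step at that batch size. It also assumes memory use grows with batch size, so that if one size fails every larger size fails too.

```python
def largest_fitting_batch_size(fits, lo, hi):
    """Return the largest batch size in [lo, hi] for which fits(b) is True.

    Returns None if even `lo` does not fit. Assumes fits() is monotonic:
    once a batch size fails, every larger one fails as well.
    """
    if not fits(lo):
        return None
    while lo < hi:
        # Round up so the loop always makes progress when fits(mid) is True.
        mid = (lo + hi + 1) // 2
        if fits(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


def oom_predicate(try_step):
    """Wrap a hypothetical try_step(batch_size) callable into a fits() test.

    try_step is assumed to run one training step and to raise RuntimeError
    with "out of memory" in its message when the GPU runs out of memory,
    as in the error reported in this issue. Other errors are re-raised.
    """
    def fits(batch_size):
        try:
            try_step(batch_size)
            return True
        except RuntimeError as err:
            if "out of memory" in str(err):
                return False
            raise
    return fits
```

Usage would look like `largest_fitting_batch_size(oom_predicate(my_step), 64, 120)`, where `my_step` is your own function. After an out-of-memory failure the process may still hold GPU memory, so running each trial in a fresh process is probably more reliable than retrying in the same one.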

I'm back. Trying it again, it seems like it doesn't use my GPU anymore.

I found the previous base_options.py and reverted to it. Tried it again with batch sizes 64 and 1 and still get CUDA error: out of memory.

damtharvey avatar Jan 14 '19 02:01 damtharvey
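An out-of-memory error at batch size 1 may mean the GPU memory is already occupied before training starts, for example by another process, though that is only a guess from the symptoms described here. A minimal sketch for checking this is below. It assumes `nvidia-smi` is on the PATH; `parse_gpu_memory` and `query_gpu_memory` are hypothetical helper names, not part of MC-GAN.

```python
import subprocess


def parse_gpu_memory(text):
    """Parse "used, total" lines (in MiB) into a list of (used, total) ints.

    Expects one line per GPU, as printed by nvidia-smi with
    --format=csv,noheader,nounits.
    """
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory of every visible GPU."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ])
    return parse_gpu_memory(out.decode("utf-8"))
```

Calling `query_gpu_memory()` just before starting `train.py` shows how much memory each GPU already has in use. If most of the 11 GB is taken at that point, the batch size is unlikely to be the cause.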

The issue of not using the GPU is fixed now! I am not sure about your CUDA out of memory error. Can you provide more details?

azadis avatar Jan 16 '19 02:01 azadis

@azadis How can I solve this problem?

Total number of parameters: 291649

model [cGANModel] was created
create web directory ./checkpoints/GlyphNet_pretrain/web...
Traceback (most recent call last):
  File "train.py", line 32, in <module>
    model.optimize_parameters()
  File "/home/ltq/FontTransfer/MC-GAN/models/cGAN_model.py", line 242, in optimize_parameters
    self.backward_G()
  File "/home/ltq/FontTransfer/MC-GAN/models/cGAN_model.py", line 224, in backward_G
    self.loss_G.backward()
  File "/usr/local/lib/python2.7/dist-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: out of memory

leitianqi avatar May 14 '19 03:05 leitianqi