
CUDA runs out of memory

Open jnelson16 opened this issue 4 years ago • 6 comments

Has anyone run into a running out of GPU memory issue when running the imagine command? Below is the error I get.

RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 6.00 GiB total capacity; 4.47 GiB already allocated; 716.80 KiB free; 4.48 GiB reserved in total by PyTorch)

I tried to use both gc.collect() and torch.cuda.empty_cache() but neither worked.

jnelson16 avatar Jan 22 '21 00:01 jnelson16

I just reduced num_layers to 16 (I've got 8 GB of dedicated memory). However, it would be nice to use the shared memory too (if it's possible here).
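For what it's worth, that reduction is just a flag passed to the imagine command, roughly like the sketch below (the prompt is only a placeholder, and exact flag syntax may vary between deep-daze versions):

```sh
# Sketch: run the SIREN network with 16 hidden layers instead of the default 32,
# which should fit in roughly 8 GiB of VRAM.
imagine "a lighthouse at dawn" --num_layers 16
```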

bord81 avatar Jan 22 '21 10:01 bord81

> Has anyone run into a running out of GPU memory issue when running the imagine command? Below is the error I get.
>
> RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; **6.00 GiB** total capacity; 4.47 GiB already allocated; 716.80 KiB free; 4.48 GiB reserved in total by PyTorch)
>
> I tried to use both gc.collect() and torch.cuda.empty_cache() but neither worked.

Edit: Just realized you're working with 6 GiB of VRAM. Have you considered using the Google Colab notebook instead? If it's your first time, they tend to give you a decent GPU with 16 GiB of memory. Not to mention it's free (unless you're using it a lot).

You can check your GPU's memory usage with NVIDIA's CLI tool nvidia-smi, which is provided with the CUDA toolkit.
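For example, to take a one-off snapshot or watch memory usage while imagine runs (these are standard nvidia-smi options, though the output layout depends on your driver version):

```sh
# One-off snapshot of all GPUs, including per-process memory usage
nvidia-smi

# Refresh just the memory figures every 2 seconds
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 2
```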

This unfortunately comes with the territory. The code runs best on a graphics card with 16 GiB of VRAM. If you've got less than that, here are some parameters you can change to lower your VRAM usage (a combined example follows the list).

decrease --image_width

This one lowers memory usage a lot. I've even done this on a 16 GiB Colab instance just so I could run 64 hidden layers on a 256px image. Just keep in mind that 512 is a decent default. You'll probably want to decrease it in multiples of 8, though I'm not sure that actually matters. At any rate, the obvious tradeoff is that you'll get a less detailed output.

lower --batch_size (default is 4).

increase --gradient_accumulate_every (default is 4).

This will generate more images before calculating the loss and running backpropagation (the memory-intensive bit). Each image's loss is divided by the accumulation count (e.g. the default of 4), because we don't want to punish the network without giving it a chance to change.

decrease --num_layers (default is 32)

This one is basically a requirement on a GPU with less than 16 GiB of memory. The default of 32 is meant for Colab users and is honestly a bit high, considering consumer GPUs don't tend to have more than 8 GiB of VRAM. Lowering it to 16 will get you below 8 GiB of VRAM, but the results will be more abstract and silly. If you do decrease this value, only lower it as much as you need and no more, because more hidden layers seem to help quite a bit.
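Putting these together, a reduced-memory run on a 6 to 8 GiB card might look roughly like the line below (the flag names are the ones listed above, the prompt is just a placeholder, and you should check imagine --help or the deep-daze README for the exact syntax your version accepts):

```sh
# Sketch of a lower-VRAM invocation combining the flags described above
imagine "a cabin in the snowy woods" --num_layers 16 --image_width 256 --batch_size 2 --gradient_accumulate_every 8
```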

afiaka87 avatar Jan 23 '21 04:01 afiaka87

@afiaka87 Thanks for your detailed response, that is very helpful. I will check out the Colab notebook, thanks. I've been using an EC2 to play around with this, but that gets expensive pretty quickly.

jnelson16 avatar Jan 25 '21 18:01 jnelson16

@jnelson16

> I've been using an EC2 to play around with this, but that gets expensive pretty quickly.

No kidding! Colab is still the cheapest possible option, but I've found vast.ai to have really good pricing and some awesome RTX 3090 (24 GiB VRAM) instances. I don't think they have any security guarantees whatsoever though, so that's the tradeoff. But if you're just generating stuff with deep-daze, it can be awesome.

afiaka87 avatar Jan 31 '21 03:01 afiaka87

@afiaka87 Hello, I am new to AI and Python in general. I tried running deep-daze but also ran out of memory, just like @jnelson16. I'm doing this all in PowerShell. I tried typing in

imagine "one"

and got this back

CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 6.00 GiB total capacity; 4.10 GiB already allocated; 96.50 MiB free; 4.12 GiB reserved in total by PyTorch)

I know that I need to decrease the settings for the image being generated, so I typed in

--image_width=256

but it seems I have the wrong syntax. Would you mind explaining how I would type these commands into my terminal to get the right outcome?

@bord81 if you also wouldn't mind sharing exactly what I would need to type to reduce num_layers to 16, that would be divine.

cagi2000 avatar Apr 15 '21 04:04 cagi2000

Hello, I get this strange error the whole time, but there is still space left on my GPU RAM. I tried using --numlayers 16 but the error is still the same, and I also used --batch_size 2, but neither of them helped me.

RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 4.00 GiB total capacity; 2.47 GiB already allocated; 0 bytes free; 2.49 GiB reserved in total by PyTorch)

alien-einstein avatar Jun 28 '21 16:06 alien-einstein