ControlNet
Low VRAM tests (8GB now OK, starting to solve 6GB)
RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 8.00 GiB total capacity; 7.14 GiB already allocated; 0 bytes free; 7.26 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
same here
I will try to add sliced attention this weekend (and perhaps copy some code from Automatic1111). Right now it seems the model OOMs on 8GB GPUs. See also https://github.com/CompVis/stable-diffusion/issues/39
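For context, "sliced attention" just computes attention in chunks over the query dimension so the full attention matrix never materializes at once. A minimal sketch of the idea (a hypothetical helper, not the repo's actual code):

```python
import torch

def sliced_attention(q, k, v, slice_size=1024):
    """Compute softmax(q @ k^T / sqrt(d)) @ v in query chunks.

    q, k, v: (batch, seq_len, dim), assumed to share the same shape.
    Only `slice_size` rows of the attention matrix exist in memory at
    any time, trading a little speed for a much smaller peak VRAM use.
    """
    scale = q.shape[-1] ** -0.5
    out = torch.empty_like(q)
    for i in range(0, q.shape[1], slice_size):
        chunk = q[:, i:i + slice_size]                               # (b, s, d)
        attn = torch.softmax(chunk @ k.transpose(1, 2) * scale, dim=-1)
        out[:, i:i + slice_size] = attn @ v                          # (b, s, d)
    return out
```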
Amazing work, very inspiring, and I can't wait to try this. For now, I just spent all day downloading and troubleshooting on my slow internet, only to hit this OOM. I'm on a 6GB 1660 Ti, so I have to run 1111 with --medvram --precision full --no-half... Any hope at all of this ever working for me, or should I write this one off? Any chance of a Colab notebook for the rest of us?
Low VRAM mode added.
https://github.com/lllyasviel/ControlNet/blob/main/docs/low_vram.md
Tested on several 8GB cards. Let's see if it works on 6GB.
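Per the linked doc, enabling it is a one-line edit before launching any of the gradio scripts:

```python
# config.py, in the repo root
save_memory = True  # default is False
```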
Great, thanks! I have a 12GB VRAM GPU; does it also work for training? Currently I can't train: even with a batch size of 1 I get OOM errors.
Will try. But it should perhaps work already, since the attention layers are sliced.
I also tried using xformers, but I kept having incompatibility issues.
It's not working on a 6GB GeForce RTX 3060.
OK we know. Let us solve 6GB now.
Same. Besides batch size, I also tried `accumulate_grad_batches` and `save_memory`, but still no luck training with 12GB VRAM. I also tried xformers, but I keep getting `No operator found for memory_efficient_attention_backward`.
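For reference, the training tutorial uses a PyTorch Lightning trainer, so gradient accumulation is just a `Trainer` argument. A minimal sketch (argument values are arbitrary examples); note that accumulation shrinks optimizer-side pressure but the activations of a single batch still have to fit, which is why it often doesn't cure OOM on its own:

```python
import pytorch_lightning as pl

# Effective batch size = 4 x batch_size, but peak activation memory
# per step is unchanged, so a batch that OOMs alone still OOMs here.
trainer = pl.Trainer(
    gpus=1,
    precision=16,                # half precision also shrinks activations
    accumulate_grad_batches=4,   # sum gradients over 4 batches per step
)
```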
> I also tried xformers, but I keep getting `No operator found for memory_efficient_attention_backward`.
~~I've seen this too, and I suspect it was a faulty installation in my case. I had PyTorch 1.13.1 from conda in a conda env, installed xformers via pip, and `memory_efficient_attention_backward` was the error. After replacing xformers with its conda release, the error was gone and I was back to OOM at the beginning of training (also on 12GB VRAM).~~ nvm, my conda xformers install had issues.
Hello. Great work with ControlNet. Any updates on 6GB VRAM solutions?
I still have this error. `save_memory = True` didn't help much:

RuntimeError: CUDA out of memory. Tried to allocate 290.00 MiB (GPU 0; 8.00 GiB total capacity; 6.29 GiB already allocated; 0 bytes free; 7.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
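The `max_split_size_mb` hint from the traceback can be tried by setting the allocator config before CUDA initializes; 128 here is just an example value:

```python
import os

# Must be set before torch makes its first CUDA allocation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # import (and everything CUDA) only after setting the env var
```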
@My12123 Maybe you can set `save_memory = True` in `config.py` if you have not already. That makes at least the Canny model run on an 8GB GPU; I tried it with an RTX 3070.
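For anyone wondering what the flag buys: the general idea behind this kind of memory saving is to keep each submodule (text encoder, UNet, VAE) on the CPU except while it is actually running. A generic sketch of that pattern (illustrative only, not the repo's exact code):

```python
import torch

def on_gpu_only(module, fn, *inputs):
    """Run fn(*inputs) with `module` temporarily on the GPU, then move it
    back to the CPU and release cached blocks so the next stage has VRAM.

    Example (hypothetical names): images = on_gpu_only(vae, vae.decode, latents)
    """
    module.to("cuda")
    try:
        with torch.no_grad():
            return fn(*inputs)
    finally:
        module.to("cpu")
        torch.cuda.empty_cache()
```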
Also, you can install xformers with `pip install xformers`. It seems to work pretty well, saving a gigabyte or so of memory, and inference is twice as fast. If you install xformers, just make sure you have PyTorch 1.13.1 (the latest as of now).
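If you want to confirm the install actually has working kernels, a quick smoke test (shapes are arbitrary; `memory_efficient_attention` expects `(batch, seq_len, heads, head_dim)`):

```python
import torch
from xformers.ops import memory_efficient_attention

# fp16 tensors on CUDA: the common training configuration.
q = torch.randn(2, 1024, 8, 40, device="cuda", dtype=torch.float16, requires_grad=True)
k = torch.randn_like(q, requires_grad=True)
v = torch.randn_like(q, requires_grad=True)

out = memory_efficient_attention(q, k, v)
out.sum().backward()  # the backward pass is what trips "No operator found for memory_efficient_attention_backward"
print("xformers forward/backward OK")
```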
The version mentioned in `environment.yaml` is 1.12.1, though.

https://github.com/lllyasviel/ControlNet/blob/3e1340d83250c23e972f98900c764c36e5d7bd69/environment.yaml#L9
Yes. But it also works if we use the latest version of PyTorch along with the latest version of PyTorch Lightning.
I am not able to train the network, even using the save memory option. @sovit-123 can you help me?
@engrmusawarali Hi, even with the save memory option you need at least 18 GB of VRAM to train the model. That's what I found out when it was initially released. Not sure if that requirement has changed since then.
Have you tried small-scale training? If yes, how do I proceed with it? Can you guide me, @sovit-123?
@engrmusawarali I have not tried it yet. But planning to do it soon.
Training doesn't work on a 12GB VRAM RTX 3060.