sd_dreambooth_extension
RuntimeError: CUDA out of memory
My configs:
32 GB RAM, RTX 3070 Ti
I always get this error. Can anyone suggest some optimizations to avoid it?
RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 8.00 GiB total capacity; 7.09 GiB already allocated; 0 bytes free; 7.23 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
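The error message itself suggests one mitigation: setting `max_split_size_mb` via the `PYTORCH_CUDA_ALLOC_CONF` environment variable to reduce fragmentation. A minimal sketch (the value 128 is just an example; tune it for your card):

```python
import os

# Must be set before torch initializes CUDA, so do it here
# (or export it in your shell) before importing torch.
# Smaller values let the caching allocator split large cached blocks,
# which can reduce fragmentation at some performance cost.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

On Windows you can also set it in the webui launch script with `set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` before the script starts Python.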
What are you trying to do when you're running out of memory?
Are you using LORA?
Are you using LORA?
I enabled this and now it seems to be working; training is in progress. I'll report back if I succeed.
Are you using LORA?
What exactly does LORA do?
https://github.com/cloneofsimo/lora It is a smaller approximation of dreambooth: it trains small low-rank adapter weights instead of the full model, so it needs far less VRAM. It doesn't work properly with gradient checkpointing for now, so don't waste time training with that on.
Without LORA you can't train at all with 8GB VRAM card.
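To see why LORA saves so much memory, compare trainable parameter counts: instead of updating a full d×d weight matrix, LORA trains two low-rank factors of shape d×r and r×d. A back-of-the-envelope sketch (the dimensions and rank below are illustrative assumptions, not values from this thread):

```python
def trainable_params(d_model: int, rank: int) -> tuple[int, int]:
    """Compare full fine-tuning vs. a LoRA adapter for one d x d weight matrix."""
    full = d_model * d_model   # every weight needs a gradient plus optimizer state
    lora = 2 * d_model * rank  # only the two factors A (d x r) and B (r x d) are trained
    return full, lora

full, lora = trainable_params(d_model=768, rank=4)
print(full, lora, full / lora)  # 589824 6144 96.0
```

At rank 4 the adapter holds roughly 1% of the parameters of the matrix it approximates, which is why it fits where full dreambooth training does not.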
Note that the reason it 'runs out of memory' is that the Windows display driver model (WDDM 2.x) reserves roughly 8-15% of the VRAM for itself, even on a dedicated card, when you have two GPUs (such as integrated graphics plus a dedicated card, a common setup in gaming laptops). So you can't use all of your VRAM for training. Also, PyTorch's memory gets fragmented: even if free memory is available that isn't reserved by Windows, it can't be allocated unless a large enough chunk is contiguous.
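The fragmentation point can be illustrated with a toy allocator: a request fails when no single contiguous free block is big enough, even though the total free memory would cover it. This is a hypothetical sketch of the concept, not PyTorch's actual allocator:

```python
def can_allocate(free_blocks: list[int], request: int) -> bool:
    """A request succeeds only if one contiguous free block can hold it."""
    return any(block >= request for block in free_blocks)

# Three scattered 32 MiB free blocks: 96 MiB free in total...
free_blocks = [32, 32, 32]
print(sum(free_blocks))               # 96
print(can_allocate(free_blocks, 64))  # False: no single 64 MiB block is contiguous
```

This is exactly the situation the "Tried to allocate 64.00 MiB ... 0 bytes free" message describes, and what `max_split_size_mb` tries to mitigate.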
I have a 1080 Ti and I'm currently able to train v1.5 at 512 with or without LORA.
I have a 1080 Ti and I'm currently able to train v1.5 at 512 with or without LORA.
Sorry, I deleted the comment you were replying to, haha. Right after writing it I tried to start training for the n-th time and now everything just works, so... I'll have to chalk the 24 hours of failed tests up to ghosts in the computer for now.
@kitoide, I have a 3090 Ti with 24 GB of VRAM. I have to turn on "8bit Adam" under Parameters->Advanced. Make sure you leave "Gradient Checkpointing" enabled. These are both memory optimizations that sacrifice speed. I'm also using the "--xformers" startup option, but I'm not sure if that's necessary. Finally, make sure both "Batch Size" and "Class Batch Size" are set to 1; increasing the batch sizes uses more VRAM.
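For reference, the "--xformers" startup option mentioned above goes into the webui launch script's command-line arguments. A sketch for webui-user.bat on Windows (the COMMANDLINE_ARGS variable is the webui's own convention; add any other flags you need alongside it):

```shell
rem webui-user.bat (Windows)
set COMMANDLINE_ARGS=--xformers
```

The "8bit Adam" and "Gradient Checkpointing" toggles, by contrast, live in the extension's UI (Parameters->Advanced), not in the launch script.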
How much VRAM do you need to train the new SD 2.0 and 2.1 models? I run out of memory with 12 GB of VRAM. I tried all the memory attention options, and I have all the VRAM-saving options enabled except "Train Text Encoder" and "Use CPU", but I still cannot get it to train. Has anyone here successfully started training on 2.x with 12 GB of VRAM?
Note that the reason it 'runs out of memory' is that the Windows display driver model (WDDM 2.x) reserves roughly 8-15% of the VRAM for itself, even on a dedicated card, when you have two GPUs (such as integrated graphics plus a dedicated card, a common setup in gaming laptops). So you can't use all of your VRAM for training. Also, PyTorch's memory gets fragmented: even if free memory is available that isn't reserved by Windows, it can't be allocated unless a large enough chunk is contiguous.
I turned off my integrated one.
@kitoide, I have a 3090 Ti with 24 GB of VRAM. I have to turn on "8bit Adam" under Parameters->Advanced. Make sure you leave "Gradient Checkpointing" enabled. These are both memory optimizations that sacrifice speed. I'm also using the "--xformers" startup option, but I'm not sure if that's necessary. Finally, make sure both "Batch Size" and "Class Batch Size" are set to 1; increasing the batch sizes uses more VRAM.
I do all this, to no avail!
There should be some way to train with the memory that is available, without overflowing it.
I'll be following this thread. I've heard people say they trained on a 1080, so there must be something we can do, lol. But at the same time it's still pretty inconsistent, since the people I saw training with 8 GB had the same settings I did. (I'm running on an RTX 3070.)
Some recent regressions in VRAM usage: I was able to train a week ago with LORA, and now I can't train anything.
This issue is stale because it has been open 5 days with no activity. Remove stale label or comment or this will be closed in 5 days
Some recent regressions in VRAM usage: I was able to train a week ago with LORA, and now I can't train anything.
New version out. Give it a go.
I am unable to train with LORA now either. This is the first time I've tried LORA, so I don't know if it's related. I'm on 8 GB as well.
Sorry to bother you.
I am unable to train with LORA now either. This is the first time I've tried LORA, so I don't know if it's related. I'm on 6 GB as well.
Sorry to bother you.