dreambooth-gui
DeepSpeed integration
Is your feature request related to a problem? Please describe. Yes, I am unable to complete training by a small margin. Adding --sample_batch_size=1 to the additional parameters lets me process the source images, but during training itself VRAM falls short by what seems like a small amount.
Describe the solution you'd like See https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth#training-on-a-8-gb-gpu. By using the DeepSpeed-offloaded Adam optimizer, VRAM usage can be lowered at the cost of higher RAM usage. This would be an ideal solution for me; however, it should be an optional thing (its own branch?), as it is not compatible with --use_8bit_adam (again, see the link).
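For reference, the linked README drives training through an accelerate config that enables DeepSpeed ZeRO with CPU offload. The fragment below is a hedged sketch of what such a config looks like (field names follow accelerate's DeepSpeed config schema; the exact values are illustrative assumptions, not taken from this repo):

```yaml
# Minimal accelerate config sketch enabling DeepSpeed ZeRO stage 2
# with optimizer/parameter offload to CPU (trades VRAM for RAM).
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2
  offload_optimizer_device: cpu
  offload_param_device: cpu
  gradient_accumulation_steps: 1
mixed_precision: fp16
num_processes: 1
```

Because DeepSpeed supplies its own CPU-offloaded Adam here, --use_8bit_adam would have to be dropped when launching with this config.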
Describe alternatives you've considered As an absolute beginner, I've been on a journey through repos and landed here. This is the alternative.
Hey, if you don't have enough VRAM, do you want to try training Stable Diffusion with LoRA? It requires far less memory and can give you nearly the same results.
LoRA is fully supported in the A1111 dreambooth extension. https://github.com/d8ahazard/sd_dreambooth_extension
UPDATE 2: not successful yet; I have to experiment some more, I guess. However, this will use accelerate, which is already included in the repo you pull from: https://github.com/Lolzen/dreambooth-gui/commit/518d952e98c034db54783e5374971c6410691013
UPDATE: the accelerate command is seemingly issued successfully. There are warnings, and it still OOMs regardless (due to the missing config file for offloading, I suspect; working on adding that). I'll let you know within the next few hours whether I've succeeded. Right now I'm pushing a new Dockerfile that will hopefully copy my accelerate config file to HF_HOME (the accelerate documentation says it looks for the config file in HF_HOME or .cache/huggingface by default, without having to point to the file explicitly; let's see).
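The HF_HOME idea above can be sketched roughly as follows. Assumptions: accelerate's default lookup path is $HF_HOME/accelerate/default_config.yaml (falling back to ~/.cache/huggingface), and the config contents here are illustrative only, not this fork's actual file:

```shell
# Place a DeepSpeed-enabled accelerate config where accelerate looks by
# default, so no explicit --config_file flag is needed at launch time.
export HF_HOME="${HF_HOME:-$HOME/.cache/huggingface}"
mkdir -p "$HF_HOME/accelerate"

# Write a minimal, illustrative config (ZeRO stage 2, CPU offload).
cat > "$HF_HOME/accelerate/default_config.yaml" <<'EOF'
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2
  offload_optimizer_device: cpu
  offload_param_device: cpu
mixed_precision: fp16
num_processes: 1
EOF
```

In a Dockerfile, the equivalent would be an ENV HF_HOME plus a COPY of the config file into that directory.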
-- Thanks for the suggestion. I tried LoRA training as well, but it won't finish either. I'm having trouble using accelerate in A1111, as there is some mismatch, and I haven't been able to figure out where to change the config (note: not the default accelerate config itself).
I have experimented with a fork of your (this) repo and can use torch 2.0.1 without problems; the only thing blocking me is actually using accelerate to offload things to the CPU.
I do have 32 GB of RAM, but only 8 GB of VRAM, which is the deal breaker.
Reference: https://github.com/Lolzen/dreambooth-gui/tree/deep-speed https://hub.docker.com/repository/docker/lolzen/dreambooth/tags?page=1&ordering=last_updated
(Please keep in mind that I have basically no clue what I'm doing with Docker; the whole concept is new to me.)