sd_dreambooth_extension
Is Stable Diffusion 2.0 working for any of you 12GB users?
I tried using the 768 v2.0 model.
Have you read the Readme? Yes
Have you completely restarted the stable-diffusion-webUI, not just reloaded the UI? Yes
Have you updated Dreambooth to the latest revision? Yes fresh install today
Have you updated the Stable-Diffusion-WebUI to the latest version? Yes
I wanted to know if SD 2.0 training works for 12GB VRAM users. If so, do we just follow the guide on the front page of this repo?
Environment
What OS? Windows
If Windows - WSL or native? Native
What GPU are you using? Nvidia 2080 ti
I can't (with text encoder).
I also can't train the text encoder. I compared sample images from 1.5 vs. 2.0, and the 2.0 output doesn't even resemble the training subject.
Should I mention it here as well? 2.1 is training, but without the text encoder. Selecting the text encoder gives a CUDA memory error. Windows 10, RTX 3060 12GB, FP16, xFormers.
Yep. If only the text encoder weren't so important.
Please try with the latest version. It should be possible with fp16 / 8-bit Adam / xformers or flash attention on 12GB.
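For anyone wondering what that combination means concretely, here's a rough sketch using the diffusers reference script rather than this extension's UI. Paths and hyperparameters are placeholders; the flags follow recent versions of `examples/dreambooth/train_dreambooth.py`:

```bash
# Placeholder paths and hyperparameters; flags as in recent diffusers
# examples/dreambooth/train_dreambooth.py.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-2-1" \
  --instance_data_dir="./instance_images" \
  --output_dir="./dreambooth_out" \
  --resolution=768 \
  --train_batch_size=1 \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --enable_xformers_memory_efficient_attention \
  --gradient_checkpointing \
  --train_text_encoder \
  --learning_rate=1e-6 \
  --max_train_steps=1000
```

Dropping `--train_text_encoder` is the usual fallback when this still OOMs.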
I just tried the latest commit on a 12GB 3060, Win10. EMA with text-encoder training OOMs.
LoRA with text training is running; I'll leave it going overnight. But now I'm unsure whether LoRA is actually training the text encoder, given comments in this thread by the author of the LoRA repo: https://github.com/cloneofsimo/lora/issues/16
Yeah, I can't with the text encoder. 3060 12GB. Killed every process possibly using GPU RAM, even explorer.exe, and it's still not enough.
Tried EMA/no EMA, fp16, xformers/flash attention, don't cache latents, gradient checkpointing. I haven't looked into this LoRA thing much yet; I guess I'd better.
edit - I let LoRA run, but the results are terrible compared to my usual v1 512x model training. I used the same dataset at 768x, with the redshift 768x model instead of the old redshift 512x one. I saw a comment here suggesting 1e-4 for LoRA vs. 1e-6 for normal Dreambooth, so that's what I tried first; the results were awful. Then I tried 1e-6, which was better, but still nowhere near as good as my non-LoRA SD1 training.
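For reference, the memory savers being toggled here map roughly onto these diffusers/transformers/bitsandbytes calls. This is a sketch with illustrative variable names, not the extension's actual code:

```python
# Assumes a diffusers UNet2DConditionModel `unet` and a transformers
# CLIPTextModel `text_encoder` are already loaded.
import bitsandbytes as bnb

unet.enable_gradient_checkpointing()               # trade compute for VRAM
unet.enable_xformers_memory_efficient_attention()  # memory-efficient attention
text_encoder.gradient_checkpointing_enable()       # matters when training the encoder

# 8-bit Adam keeps optimizer state small enough for a 12GB card.
params = list(unet.parameters()) + list(text_encoder.parameters())
optimizer = bnb.optim.AdamW8bit(params, lr=1e-6)
```

Even with all of these enabled, training the text encoder adds a second full set of gradients and optimizer state, which is why it tends to be what pushes a 12GB card over.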
Interested to hear if you fare any better.
LoRA doesn't train the text encoder yet!
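For context, cloneofsimo/lora's training scripts work by injecting trainable low-rank adapters into chosen modules; the text-encoder half is the part that was missing at this point. A sketch based on that repo's scripts (exact signatures may differ between versions):

```python
# Sketch based on cloneofsimo/lora's training scripts; signatures may
# differ between versions.
from lora_diffusion import inject_trainable_lora

# Freeze the base weights, then inject small trainable rank-r matrices
# into the UNet's attention modules.
unet.requires_grad_(False)
unet_lora_params, _ = inject_trainable_lora(unet, r=4)

# The later text-encoder support injects into CLIP attention blocks.
text_encoder.requires_grad_(False)
text_lora_params, _ = inject_trainable_lora(
    text_encoder, target_replace_module=["CLIPAttention"], r=4
)
```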
Reporting back: latest build, 3060 12GB / Win10, LoRA, 10K steps saved at 1K intervals, 62 concept images, LR 1.1e-6, 260 class images, v2.1 at 512 res. After fixing scheduler_config/_lora.yaml, it finished without trouble, but the concept was not well learned.
One note: an earlier LoRA run without class preservation learned the subject much better, though still not well enough. I'll keep trying different settings.
@raymondgp I've found that using a large learning rate helps a lot! Try around 5e-5 to 4e-4 to see a meaningful effect!
@raymondgp Now training with the text encoder is available!
@cloneofsimo Indeed! I ran the first 10K steps with LoRA and text training at 4e-4. My subject is there, not fully learned, but I'll work on improving this.
Amazing, thank you both for your hard work. I'll report back on Reddit after a few tests; I've been following your comments there.
You're welcome, and thank you for trying my stuff out haha 😄 Would these results be helpful to you? https://github.com/cloneofsimo/lora#tips-and-discussions
This issue is stale because it has been open for 5 days with no activity. Remove the stale label or comment, or this will be closed in 5 days.