fast-stable-diffusion
Has something been changed?
Hi, I have been training faces on the Realistic Vision model for about a week and the results were always good, but today something is wrong. It seems that after loading the trained model into Stable Diffusion, it just generates photos based on my instance images, and prompts don't work the way they did before. I thought I had overtrained it, so I tried again with 25 images and fewer steps, same results; I even tried with 10 pics and still the same. I usually train a face with 30-70 photos, a 5e-6 UNet learning rate, and a 1e-6 text encoder learning rate. Yesterday I tried with 2e-6 UNet and 1e-6 text and the results were amazing. Today it just seems broken; I even tried training the face over another model, still the same. Has anyone had the same problem?
Sorry for my eng :)
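For readers less used to the notation: the values quoted in this thread are learning rates in scientific notation ("5e-6" means 0.000005), while the step count is a separate setting. A tiny illustration; the dictionary keys below are made-up placeholders for readability, not the actual Colab form fields:

```python
# Scientific notation as used in this thread: 5e-6 == 0.000005, 1e-6 == 0.000001.
# The keys below are illustrative placeholders, not the real Colab fields.
run_settings = {
    "unet_learning_rate": 2e-6,          # the value most comments here settled on
    "text_encoder_learning_rate": 1e-6,
    "unet_training_steps": 1500,         # a step count, not a learning rate
}
for name, value in run_settings.items():
    print(f"{name}: {value:g}")
```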
And what is this? First time seeing it.
And what is this? First time seeing it.
That's just the model downloading
Try the latest colab; the one you're using might be broken.
I'm having the same issue today, even with an empty prompt I'm getting only variations of my instance images 🤷🏼♂️
Try the latest colab; the one you're using might be broken.
Well, I have always used the latest version, and even checked it today. Tried again and got the same awful results. Idk...
Try the latest colab; the one you're using might be broken.
Hello! The problem comes from one of your recent updates. I know 100% that the commit "e59c0fc - fix RAM problem for good, thanks to @Daviljoe193" was working very well. I don't really know what you did after that, but yeah, now the model only reproduces the images used in training (distorted and weird). I think the solution is very simple: just revert everything to that last working commit.
Edit: With exactly the same settings as before, the models are now a joke.
Oh, yes RAM is fine now! Been having fun with merges again 👍
I'm having the same issue today, even with an empty prompt I'm getting only variations of my instance images 🤷🏼♂️
Yes. I'm using the latest colab and today I have the same problem; the prompt does not work properly. It only generates strange variations of the instance images.
Also having this issue fwiw.
As I said, the easiest solution (for Ben) would be to revert everything he did in the last 2 days. I don't know what he did, but even the checkpoint merger doesn't work anymore ("cuda out of memory" error for the same models I tried a few days ago, when everything worked very well).
I thought I was losing my mind. Glad it wasn't me, though; I have burnt through a lot of Colab compute thinking it was. Should have checked here first!
Since I got mentioned here (I guess), I'll weigh in on this issue.
People say that the last "good" commit was the one where my suggestion for fixing the memory leak was applied. Let's see what happened in the 14 commits between then and now, commit by commit. Note: I'm NOT a developer here; despite my mention in a few commits, I'm just a clown who happens to look like a developer.
Click to expand, it's a lot
Commit 1
Pretty easy here. The learning rate for UNet training was lowered from 5e-6 for 1000 steps to 2e-6 for 1500 steps. Realistically, this should reduce any weird artifacting and reduce the likelihood of overfitting, at the cost of a small increase in training time.
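A back-of-the-envelope way to see why this is more conservative is to multiply learning rate by step count as a rough "update budget". This is not a real model of training dynamics, just a sanity check on the numbers in that commit:

```python
# Rough comparison of the old and new defaults from Commit 1.
old_lr, old_steps = 5e-6, 1000
new_lr, new_steps = 2e-6, 1500

print(f"old budget: {old_lr * old_steps:.1e}")  # 5.0e-03
print(f"new budget: {new_lr * new_steps:.1e}")  # 3.0e-03, about 40% lower despite more steps
```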
Commit 2-3
This one isn't too hard to explain either. A tarball containing the dependencies for Dreambooth/Automatic1111 was updated to use Python 3.10 instead of Python 3.9, and while the old tarball is still there on Huggingface, there shouldn't be anything here that could break anything.
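For context, that kind of dependency rebuild usually just tracks the Python version of the Colab runtime itself; a quick way to check what the runtime is actually running:

```python
# Print the Python version of the current runtime; the prebuilt dependency
# tarball has to match it (the commit above moved it from 3.9 to 3.10).
import sys
print(sys.version)
```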
Commit 4-5
These commits are related, so I have to cover them together. First, part of this just suppresses a pointless "warning" that "warns" that there's a cool shiny thing we could use, but don't need. Second, it appears to be (I'm just a guy, not the maintainer or a Python expert) prefetching a dependency for the webui, plus maybe some mild code cleanup. Again, nothing here could break anything.
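As an illustration only (this is not the actual diff from those commits), silencing a specific warning in Python typically looks like this; the message pattern below is a hypothetical example:

```python
# Illustrative only, not the change from commits 4-5: silence a known-noisy
# warning so it stops cluttering the Colab output.
import warnings

# Hypothetical message pattern; the real commit targets a different warning.
warnings.filterwarnings("ignore", message=".*TensorFloat-32.*")
```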
Commit 6
That same prefetch from commit 5, but applied to the dedicated SD notebook.
Commit 7
More guesswork needed here. It seems that all this change does is tell the webui not to attempt to fetch the stuff that commits 5-6 have already gotten.
Commit 8
Commit 7, but for Dreambooth.
Commit 9
A fix for an error I could've sworn I'd seen on the issues page before, but can't seem to find at the moment. It 100% couldn't affect training, though.
Commit 10-12
Self-explanatory: adds everything needed for ControlNet to work, and it's only applied to the Automatic1111 notebook, not the Dreambooth notebook.
Commit 13
Self-explanatory: just fixes a problem with resuming training on 768 v2.x models in Dreambooth. Likely not what's causing people trouble here.
Commit 14
Uh, it's a 4-character change to which URL gets git cloned. Nothing to see here.
Not trying to invalidate what people have said here about having trouble with overfitting, since I too have had a ton of trouble getting anything that's not either overfitted or ugly (though I still don't fully get Dreambooth's settings, and I haven't trained any models in over a month), but nothing of significance has changed during the last 14 commits.
So what do you suggest? How do we train over custom models now, if it's always overfitting? I never had such a problem since I started using it. The last time I trained without any problems was 18 Feb; it all started the next day.
Something was definitely broken.
- Resuming training for SD2.1-768px based models was throwing an error.
- Training on SD2.1-512px did not crash, but the result was terrible. The upper 'loss' values varied around 0.5-0.6, which is normal, but the lower 'loss' values were about 1e-4, which is very strange for a first training run.
PS: I recently ran the training again with the same dataset and parameters. The 'loss' value now varies between 1e-1 and 0.7. Looks like the problem is fixed.
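For anyone who wants to eyeball their own loss values the same way, here is a minimal sketch of smoothing the per-step losses with a running mean, assuming you've copied the printed values into a list (the numbers below are placeholders, not real training data):

```python
# Smooth raw per-step loss values with a running mean so the trend is visible.
def running_mean(losses, window=50):
    out = []
    for i in range(len(losses)):
        chunk = losses[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

losses = [0.6, 0.45, 0.52, 0.30, 0.41]  # placeholder values copied from the trainer's log
print(running_mean(losses, window=3))
```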
So I tried it again today with a 1e-6 text encoder learning rate and 10 photos. It was a mess. Tried it now with the same photos at 4e-7 and it's better.
There are no standard settings for all datasets; you have to find the right settings for your dataset.
I think it's something with the UNet; it became more sensitive. Now I used 23 photos with only 650 steps, even the text encoder at 1e-6, and it's OK. Usually I would use 2300 steps for that.
Yes it became more efficient, so you don't need 4000 steps to train on a single subject
Yes it became more efficient, so you don't need 4000 steps to train on a single subject
Oh, if only you'd said that earlier 😄 Where can we read about those updates?
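Pulling together the numbers reported in this thread (roughly 100 UNet steps per image before the change, closer to 30 after it), here is a rough starting-point helper. The ratios are only what commenters here found, not official guidance, and `suggested_unet_steps` is just a made-up name for illustration:

```python
# Rough starting-point calculator based on figures reported in this thread.
# 100 steps/image matches the older behaviour ("2300 steps for 23 photos"),
# ~30 steps/image is what people report working after the update.
def suggested_unet_steps(num_images, steps_per_image=30):
    return num_images * steps_per_image

print(suggested_unet_steps(23))        # 690, close to the 650 mentioned above
print(suggested_unet_steps(23, 100))   # 2300, the old rule of thumb
```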
I have finally made it work as before. Check out this notebook: https://github.com/Bullseye-StableDiffusion/fixed/blob/main/fast_DreamBooth_fixed.ipynb It took me a good few hours to get it working, because I'm no coder. I used my logical thinking to revert everything to a stable version. I don't know how to fork a certain commit, so I downloaded the older commit, uploaded it to a new repository, and modified everything accordingly.
Edit: No more distortions and weird images. Add the names of the .jpg files to the prompts to get more of your character's characteristics.
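For anyone who, like the comment above, wants an older state of the repo without manually re-uploading files: a minimal sketch of pinning a local clone to a specific commit with git, using the e59c0fc hash mentioned earlier in this thread (in Colab the same git commands can be run in a cell with a leading `!`):

```python
# Sketch: clone the repo and check it out at a known-good commit instead of
# re-uploading files by hand. The hash comes from the comment earlier in this thread.
import subprocess

REPO = "https://github.com/TheLastBen/fast-stable-diffusion"
GOOD_COMMIT = "e59c0fc"  # "fix RAM problem for good" per the thread

subprocess.run(["git", "clone", REPO, "fast-stable-diffusion"], check=True)
subprocess.run(["git", "-C", "fast-stable-diffusion", "checkout", GOOD_COMMIT], check=True)
# From here you can open the notebook from this pinned checkout, or push it to
# your own fork if you want a stable Colab URL.
```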
Thanks mate! going to test it today
I have finally made it work as before. Check out this notebook: https://github.com/Bullseye-StableDiffusion/fixed/blob/main/fast_DreamBooth_fixed.ipynb It took me a good few hours to get it working, because I'm no coder. I used my logical thinking to revert everything to a stable version. I don't know how to fork a certain commit, so I downloaded the older commit, uploaded it to a new repository, and modified everything accordingly.
Edit: No more distortions and weird images. Add the names of the .jpg files to the prompts to get more of your character's characteristics.
THANKS!!! IT WORKS AMAZINGLY! Finally! I was trying everything, but the new version always deformed faces, even with low steps.
Training SD2.1 still gives terrible results.
What the hell does it look like? Undertraining / overtraining? UNet / text_encoder?
Hi, I have been training faces on the Realistic Vision model for about a week and the results were always good, but today something is wrong. It seems that after loading the trained model into Stable Diffusion, it just generates photos based on my instance images, and prompts don't work the way they did before. I thought I had overtrained it, so I tried again with 25 images and fewer steps, same results; I even tried with 10 pics and still the same. I usually train a face with 30-70 photos, a 5e-6 UNet learning rate, and a 1e-6 text encoder learning rate. Yesterday I tried with 2e-6 UNet and 1e-6 text and the results were amazing. Today it just seems broken; I even tried training the face over another model, still the same. Has anyone had the same problem? Sorry for my eng :)
This has been happening to me as well. I trained my model the same way as before, which was working fine a few days ago. These are my results now; this is supposed to be a tiger eating fish in a jungle.
Guys, what are your base models? With SD1.5, it probably still works correctly. With SD2.1 (512 or 768), it gets terrible results. Moreover, the larger the dataset, the worse the resulting model. The picture I posted above was generated by a model trained on 180 photos.
Hi, I have been training faces on the Realistic Vision model for about a week and the results were always good, but today something is wrong. It seems that after loading the trained model into Stable Diffusion, it just generates photos based on my instance images, and prompts don't work the way they did before. I thought I had overtrained it, so I tried again with 25 images and fewer steps, same results; I even tried with 10 pics and still the same. I usually train a face with 30-70 photos, a 5e-6 UNet learning rate, and a 1e-6 text encoder learning rate. Yesterday I tried with 2e-6 UNet and 1e-6 text and the results were amazing. Today it just seems broken; I even tried training the face over another model, still the same. Has anyone had the same problem? Sorry for my eng :)
This has been happening to me as well. I trained my model the same way as before, which was working fine a few days ago. These are my results now; this is supposed to be a tiger eating fish in a jungle.
Just use fewer UNet and text encoder steps; it became more sensitive somehow.
It seems to me that the issue is not about the number of steps. In the last example, I used only 30 steps per image. The result is still terrible.
Guys, what are your base models? With SD1.5, it probably still works correctly. With SD2.1 (512 or 768), it gets terrible results. Moreover, the larger the dataset, the worse the resulting model. The picture I posted above was generated by a model trained on 180 photos.
I'm using 2.1 (768) and it's getting trash results; however, 3 days ago it was perfectly fine with the same dataset and training.
Hi, I have been training faces on the Realistic Vision model for about a week and the results were always good, but today something is wrong. It seems that after loading the trained model into Stable Diffusion, it just generates photos based on my instance images, and prompts don't work the way they did before. I thought I had overtrained it, so I tried again with 25 images and fewer steps, same results; I even tried with 10 pics and still the same. I usually train a face with 30-70 photos, a 5e-6 UNet learning rate, and a 1e-6 text encoder learning rate. Yesterday I tried with 2e-6 UNet and 1e-6 text and the results were amazing. Today it just seems broken; I even tried training the face over another model, still the same. Has anyone had the same problem? Sorry for my eng :)
This has been happening to me as well. I trained my model the same way as before, which was working fine a few days ago. These are my results now; this is supposed to be a tiger eating fish in a jungle.
Just use fewer UNet and text encoder steps; it became more sensitive somehow.
I will just roll back a few days; I don't want to figure out this new way to train models.
Test this colab from @Bullseye-StableDiffusion: https://colab.research.google.com/github/Bullseye-StableDiffusion/fixed/blob/main/fast_DreamBooth_fixed.ipynb I'm just testing it right now.