Training Stable Diffusion with DreamBooth/textual inversion with two different classes of images
Describe the bug
Hi,
I am following the Stable Diffusion DreamBooth training example on my local machine. The following link gives all the details about training, but I cannot figure out how to use this example when I have two or more different classes of images: https://github.com/huggingface/diffusers/tree/main/examples/dreambooth#training-on-a-16gb-gpu
For example, instead of using only the dog toy example, the user may also want to include a cat toy example in a single training script.
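To make the request concrete, the sketch below shows the kind of configuration I have in mind. It is purely illustrative: the paths, prompts, and the `subjects` structure are placeholders, not arguments that the current train_dreambooth.py accepts.

```python
# Purely illustrative description of a two-subject run; these fields are
# placeholders and are not understood by train_dreambooth.py today.
subjects = [
    {
        "instance_data_dir": "./data/dog_toy",
        "instance_prompt": "a photo of sks dog toy",
        "class_data_dir": "./data/dog_class",
        "class_prompt": "a photo of a dog toy",
    },
    {
        "instance_data_dir": "./data/cat_toy",
        "instance_prompt": "a photo of efs cat toy",
        "class_data_dir": "./data/cat_class",
        "class_prompt": "a photo of a cat toy",
    },
]

for subject in subjects:
    print(f"{subject['instance_prompt']} -> {subject['instance_data_dir']}")
```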
Reproduction
No response
Logs
No response
System Info
- diffusers version: 0.9.0
- Platform: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.7.15
- PyTorch version (GPU?): 1.12.1+cu113 (True)
- Huggingface_hub version: 0.11.1
- Transformers version: 4.24.0
- Using GPU in script?: (True)
- Using distributed or parallel set-up in script?: No
cc @patil-suraj here
Think this is the same issue as https://github.com/huggingface/diffusers/issues/752 no?
This one is probs a bit harder @williamberman - @patil-suraj maybe you can give some guidance here
> Think this is the same issue as #752 no?

Yes, this is similar to #752.
We have a script that can potentially manage multiple subjects. We'll see how good the results are before updating the existing script in the repo, but we don't have an exact timeline for when we'll have it tested! Thank you for bearing with us @hamzafar :)
Thanks for sharing the update @williamberman. It would be wonderful if the script also took multiple prompts (a list).
This would be amazing, subscribing to this issue!
@williamberman Any update on this?
Hey folks! Sorry no update as it's not been terribly high priority for us. Will hopefully get some time to work on it after the new year :)
Interested in this feature as well, hopefully we can see an update soon!
@patil-suraj what do you think here? I feel like in such a case it might make sense to directly do fine-tuning as explained here: https://github.com/huggingface/diffusers/tree/main/examples/text_to_image
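For what it's worth, fine-tuning handles multiple subjects naturally because every image carries its own caption. Below is a rough sketch of preparing such a dataset, assuming the `datasets` imagefolder layout with a `metadata.jsonl` caption file; all file names and captions are made up for illustration.

```python
import json
from pathlib import Path

# Rough sketch of a two-subject fine-tuning dataset using the `datasets`
# imagefolder layout: images plus a metadata.jsonl whose rows reference
# each file via "file_name" and carry the caption in a "text" column.
# All file names and captions are made up for illustration.
train_dir = Path("./multi_subject_data/train")
train_dir.mkdir(parents=True, exist_ok=True)

captions = {
    "dog_toy_01.png": "a photo of sks dog toy on a table",
    "dog_toy_02.png": "a photo of sks dog toy in the garden",
    "cat_toy_01.png": "a photo of efs cat toy on a sofa",
    "cat_toy_02.png": "a photo of efs cat toy next to a window",
}

with (train_dir / "metadata.jsonl").open("w") as f:
    for file_name, text in captions.items():
        f.write(json.dumps({"file_name": file_name, "text": text}) + "\n")
```

The image files themselves would sit next to metadata.jsonl in the same folder.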
One such script is available for DreamBooth here: https://github.com/huggingface/diffusers/tree/main/examples/research_projects/multi_subject_dreambooth
Would anyone be interested in adding such a script for textual inversion under /examples/research_projects/?
We want to keep the main examples simple and easy to follow so that many users can read and easily modify them for their tasks. The goal for these scripts is to be a reference point rather than to provide every feature. That's why we put such other scripts under the /examples/research_projects/ directory. I hope this makes sense :)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I've tried this SD implementation, which supports multi-subject DreamBooth (YouTube tutorials). Here is the complete guide on how to use it: docs. Usually their data format looks something like the following:
# For two classes and their respective identifiers (`sks` and `efs`).
sks (1).png
sks (2).png
sks (3).png
...
efs (1).png
efs (2).png
...
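If useful, here is a rough sketch (mine, not code from that repo) of how a prefix-based layout like the one above could be mapped to per-image prompts; the identifier-to-prompt mapping is an assumption.

```python
from pathlib import Path

# Rough sketch (not from the linked repo): derive the subject identifier
# from each file name, e.g. "sks (1).png" -> "sks", and pair the image
# with that subject's instance prompt.
prompt_for_identifier = {
    "sks": "a photo of sks dog toy",
    "efs": "a photo of efs cat toy",
}

data_dir = Path("./instance_images")
pairs = []
for image_path in sorted(data_dir.glob("*.png")):
    identifier = image_path.name.split(" ")[0]
    if identifier in prompt_for_identifier:
        pairs.append((image_path, prompt_for_identifier[identifier]))

for image_path, prompt in pairs:
    print(image_path.name, "->", prompt)
```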
(Could someone reopen this issue?) cc @patil-suraj @patrickvonplaten
Hey @innat can you please open a new issue with an exact error description? :-)