LLM-groundedDiffusion
Do you consider compatibility with stable diffusion XL?
Since this work is training-free, it should be compatible.
We have the same need. SDXL changes the structure of the UNet and uses two text encoders, so some changes are still required. It would be great if the author could support the XL model. @TonyLianLong
The community now largely uses the XL model because the performance improvement is very noticeable. Your algorithm is still very useful, and I hope it can be adapted. Thank you!
Thanks for the suggestions! I'm working on that.
I just added an SDXL integration. You can pull the repo and use `--sdxl` when you call `generate.py`. Can you try whether it works on your end? @alleniver @lossfaller
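For reference, a minimal invocation would look like the line below; any other arguments (prompt, config, etc.) are omitted here and should follow the repo's usual usage.

```
python generate.py --sdxl
```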
I think `--sdxl` only uses the refiner. Can we get SDXL image generation?
If you get time to implement it, please implement it with LCM; LCM is the future of image generation in SDXL.
https://huggingface.co/blog/lcm_lora#how-to-train-lcm-loras
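For context, the linked post comes down to swapping in the LCM scheduler and loading the distilled LoRA weights. Here's a minimal diffusers sketch (the model IDs come from the blog post; the prompt and step settings are just examples, and this is not code from this repo):

```python
# Minimal LCM-LoRA on SDXL, following the linked Hugging Face blog post
# (illustrative sketch, not this repo's code).
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# LCM replaces the usual sampler and needs only a few denoising steps.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")

image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=4,  # LCM typically uses 2-8 steps
    guidance_scale=1.0,     # LCM expects low or no classifier-free guidance
).images[0]
```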
You're right. However, from my personal experience, using the SD base model + SDXL refiner gives results on par with SDXL base model + refiner. Not to mention that you can load training-based adapters (e.g., LMD+).
As for LCM as a replacement for the SDXL refiner, it might work if you load the LoRA weights and use an LCM refiner rather than the SDXL refiner, if that is supported.
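To make the comparison concrete, here is a rough diffusers sketch of the "SD base + SDXL refiner" pipeline being discussed; the model IDs, prompt, and structure are my assumptions for illustration, not this repo's actual code path:

```python
# SD v1.5 base generation followed by an SDXL refiner pass (illustrative sketch).
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "a photo of an astronaut riding a horse"
draft = base(prompt).images[0]                         # SD v1.5 draft image
image = refiner(prompt=prompt, image=draft).images[0]  # SDXL refiner pass
```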
First of all, thank you for your work @TonyLianLong! This is an amazing project.
What's the current state of SDXL compatibility? As far as I can tell, only upscaling with the SDXL refiner model is currently supported. I'd love to use it with an SDXL base model, but this doesn't seem possible yet. Here's the error I'm getting:
RuntimeError: The expanded size of the tensor (2048) must match the existing size (768) at non-singleton dimension 1. Target sizes: [2, 2048]. Tensor sizes: [2, 768]
I'd love to help draft a PR if you can guide me in the right direction.
Currently, SDXL is supported through the refiner (`--sdxl`). SDXL base model support is possible (I have implemented it in another codebase before) but is not implemented in this public codebase, as I found the performance with the SDXL base model similar to using the SDXL refiner on LMD+ (with `--sdxl`). If you're interested in implementing it, you can definitely draft a PR.
A difference between SDXL and SD v1/v2 is that it has two CLIP embeddings as conditions, and it has some other embeddings indicating the crop, if I recall correctly. These need to be handled differently (so it requires some work), but the model does work within our framework.
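For whoever drafts the PR, a small diffusers sketch of what "handled differently" means in the stock SDXL pipeline (my illustration, not this repo's code): the two text encoders' per-token embeddings are concatenated (768 + 1280 = 2048 dims), which matches the 2048-vs-768 mismatch in the error above, and the UNet additionally takes pooled embeddings and size/crop `time_ids` as conditions.

```python
# Probing SDXL's conditioning with diffusers (illustrative, not repo code).
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Per-token embeddings from the two text encoders are concatenated:
# 768 (CLIP ViT-L) + 1280 (OpenCLIP ViT-bigG) = 2048 dims.
prompt_embeds, neg_embeds, pooled, neg_pooled = pipe.encode_prompt(
    prompt="a photo of a cat",
    device="cuda",
    num_images_per_prompt=1,
    do_classifier_free_guidance=True,
)
print(prompt_embeds.shape)  # torch.Size([1, 77, 2048]) vs. [1, 77, 768] in SD v1

# The UNet also expects pooled embeddings plus time_ids that encode
# original_size + crop_coords + target_size:
#   unet(..., added_cond_kwargs={"text_embeds": pooled, "time_ids": add_time_ids})
```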