LLM-groundedDiffusion

Do you consider compatibility with stable diffusion XL?

Open lossfaller opened this issue 1 year ago • 9 comments

Do you consider compatibility with stable diffusion XL?

lossfaller avatar Aug 07 '23 12:08 lossfaller

Since this work is training-free, it should be compatible.

TonyLianLong avatar Aug 07 '23 16:08 TonyLianLong

We have the same need. SDXL changes the structure of the UNet and uses two text encoders, so some changes still need to be made. It would be great if the author could add compatibility with the XL model. @TonyLianLong

alleniver avatar Oct 12 '23 08:10 alleniver

The community now largely uses the XL model because the performance improvement is substantial. I also think your algorithm is very useful, and I hope it can be adapted. Thank you!

alleniver avatar Oct 12 '23 08:10 alleniver

Thanks for the suggestions! I'm working on that.

TonyLianLong avatar Oct 12 '23 16:10 TonyLianLong

I just added an SDXL integration. You can pull the repo and pass --sdxl when you call generate.py. Can you check whether it works on your end? @alleniver @lossfaller

TonyLianLong avatar Oct 28 '23 05:10 TonyLianLong
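For readers following the thread, the new flag can be sketched as below. This is a hypothetical illustration of how a boolean `--sdxl` switch behaves with `argparse`; the actual `generate.py` in the repo defines its own, much larger argument set.

```python
import argparse

# Hypothetical sketch: a boolean --sdxl switch like the one added here.
# The real generate.py in the repo has many more arguments.
parser = argparse.ArgumentParser()
parser.add_argument("--sdxl", action="store_true",
                    help="refine the generated image with the SDXL refiner")

args = parser.parse_args(["--sdxl"])
print(args.sdxl)  # True: the refiner stage is enabled
```

Omitting the flag leaves `args.sdxl` as `False`, so the default pipeline is unchanged.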

> I just added an SDXL integration. You can pull the repo and pass --sdxl when you call generate.py. Can you check whether it works on your end? @alleniver @lossfaller

I think --sdxl only uses the refiner. Can we get full SDXL image generation?

If you get time to implement it, please implement it with LCM; LCM is the future of image generation in SDXL.

https://huggingface.co/blog/lcm_lora#how-to-train-lcm-loras

rdcoder33 avatar Nov 12 '23 02:11 rdcoder33

> > I just added an SDXL integration. You can pull the repo and pass --sdxl when you call generate.py. Can you check whether it works on your end? @alleniver @lossfaller
>
> I think --sdxl only uses the refiner. Can we get full SDXL image generation?
>
> If you get time to implement it, please implement it with LCM; LCM is the future of image generation in SDXL.
>
> https://huggingface.co/blog/lcm_lora#how-to-train-lcm-loras

You're right. However, in my experience, using the SD base model + SDXL refiner gives results on par with the SDXL base model + refiner. Not to mention that you can load training-based adapters (e.g., LMD+).

As for LCM with SDXL, it might work if you load the LoRA weights and use an LCM refiner rather than the SDXL refiner, if that is supported.
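The base-plus-refiner handoff described above can be sketched generically. This is a hedged illustration with placeholder callables (`base` and `refiner` are stand-ins, not the repo's API): the base model runs most of the denoising steps, then the refiner takes over for the final fraction, mirroring the `denoising_end` / `denoising_start` split that diffusers uses for the SDXL refiner.

```python
# Hedged sketch of a two-stage base + refiner pipeline. `base` and
# `refiner` are placeholder callables, not the repo's actual API.
def generate_two_stage(base, refiner, prompt, steps=50, refine_frac=0.2):
    """Run most denoising steps with the base model, then hand the
    intermediate latent to the refiner for the last fraction of steps."""
    base_steps = int(steps * (1 - refine_frac))
    latent = base(prompt, steps=base_steps)          # e.g. SD v1 + LMD+
    return refiner(prompt, latent, steps=steps - base_steps)

# Dummy stand-ins that only track the control flow
base = lambda prompt, steps: {"latent": prompt, "steps_run": steps}
refiner = lambda prompt, latent, steps: {"image": prompt,
                                         "total": latent["steps_run"] + steps}

out = generate_two_stage(base, refiner, "a cat", steps=50, refine_frac=0.2)
print(out["total"])  # 50: all steps accounted for across both stages
```

The design point is that the base model (and any adapter like LMD+) handles layout-sensitive early denoising, while the refiner only polishes the final steps.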

TonyLianLong avatar Nov 12 '23 02:11 TonyLianLong

First of all, thank you for your work @TonyLianLong! This is an amazing project.

What's the current state of SDXL compatibility? As far as I can tell, only upscaling with the SDXL refiner model is currently supported. I'd love to use it with an SDXL base model, but this doesn't seem possible yet. Here's the error I'm getting:

RuntimeError: The expanded size of the tensor (2048) must match the existing size (768) at non-singleton dimension 1.  Target sizes: [2, 2048].  Tensor sizes: [2, 768]
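The error above is a dimension mismatch: the SD v1 text encoder emits 768-dim embeddings, while SDXL's conditioning path expects 2048 dims (the hidden states of its two text encoders, 768 + 1280, concatenated). A minimal NumPy analogue of the check that fails (the actual error comes from PyTorch's `expand`, so the message differs):

```python
import numpy as np

# SD v1/v2 text encoders emit 768-dim embeddings; SDXL conditioning is
# 2048-dim (two CLIP text encoders concatenated: 768 + 1280).
def fits_condition(embed, target_dim=2048):
    """Return True if `embed` can be broadcast to the target width."""
    try:
        np.broadcast_to(embed, (embed.shape[0], target_dim))
        return True
    except ValueError:
        return False

print(fits_condition(np.zeros((2, 768))))   # False: 768 != 2048
print(fits_condition(np.zeros((2, 2048))))  # True: widths match
```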

I'd love to help draft a PR if you can guide me in the right direction.

MaxGfeller avatar Jan 01 '24 17:01 MaxGfeller

Currently, SDXL is supported through the refiner (--sdxl). SDXL base model support is possible (I have implemented that in another codebase before) but not implemented in this public codebase, as I found the performance with SDXL base model similar to using SDXL refiner on LMD+ (with --sdxl). If you're interested in implementing it, you can definitely draft a PR.

A difference between SDXL and SD v1/v2 is that SDXL takes two CLIP embeddings as conditions, and it has some additional embeddings indicating the crop, if I recall correctly. These need to be handled differently (so it requires some work), but the model does work within our framework.
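The two-encoder conditioning can be sketched as follows. This is a simplified, hypothetical illustration (`build_sdxl_condition` and the toy encoders are made-up names, not the repo's or diffusers' API): SDXL concatenates the per-token hidden states of its two text encoders (768 + 1280 = 2048 dims) and additionally conditions on size/crop "micro-conditioning" values.

```python
import numpy as np

# Hypothetical sketch, not the repo's API: SDXL conditions on the
# concatenated hidden states of two text encoders plus micro-conditioning
# values (original size, crop top-left, target size).
def build_sdxl_condition(tokens, enc1, enc2,
                         orig_size=(1024, 1024), crop=(0, 0),
                         target_size=(1024, 1024)):
    h1 = enc1(tokens)                            # (seq, 768)  first encoder
    h2 = enc2(tokens)                            # (seq, 1280) second encoder
    cond = np.concatenate([h1, h2], axis=-1)     # (seq, 2048)
    add_time_ids = np.array(orig_size + crop + target_size, dtype=np.float32)
    return cond, add_time_ids

# Toy encoders with the real hidden sizes, applied to a 77-token prompt
enc1 = lambda t: np.zeros((len(t), 768))
enc2 = lambda t: np.zeros((len(t), 1280))
cond, ids = build_sdxl_condition(list(range(77)), enc1, enc2)
print(cond.shape, ids.shape)  # (77, 2048) (6,)
```

This is why an SD v1 pipeline's 768-dim embeddings cannot be fed to SDXL unchanged: both the concatenation and the extra size/crop embeddings have to be produced and routed through the UNet.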

TonyLianLong avatar Jan 01 '24 17:01 TonyLianLong