threestudio

Is it possible to make SDXL guidance work?

Open lzqsd opened this issue 10 months ago • 1 comments

I'd still like to find out whether anyone has made SDXL guidance work. I implemented one, but it did not work. The major changes I made were in the prompt processing part, since SDXL needs text embeddings from two text encoders plus a pooled embedding, and the UNet also takes image size information as conditioning.
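For readers comparing their own prompt processing against this description, here is a minimal sketch of the conditioning layout that SDXL base is generally understood to expect. The function names are hypothetical, and the dimensions (768 and 1280 for the two text encoders, 77 tokens, 6 size values) are assumptions taken from the public SDXL base configuration, not from the implementation discussed in this issue.

```python
def sdxl_time_ids(original_size, crop_top_left, target_size):
    """Flatten three 2-tuples into the 6 size-conditioning values.

    Order assumed here: original size, crop top-left, target size.
    """
    return list(original_size) + list(crop_top_left) + list(target_size)


def sdxl_conditioning_shapes(batch, tokens=77, dim_enc1=768, dim_enc2=1280):
    """Return the expected tensor shapes for SDXL conditioning inputs.

    Hypothetical helper for shape checking only; it creates no tensors.
    """
    return {
        # Per-token embeddings of both encoders, concatenated on the last axis.
        "prompt_embeds": (batch, tokens, dim_enc1 + dim_enc2),
        # Pooled embedding, taken from the second text encoder.
        "text_embeds": (batch, dim_enc2),
        # Image size information passed to the UNet.
        "time_ids": (batch, 6),
    }


if __name__ == "__main__":
    print(sdxl_time_ids((1024, 1024), (0, 0), (1024, 1024)))
    print(sdxl_conditioning_shapes(batch=2))
```

Printing the shapes of the real tensors next to these values is a quick way to rule out a conditioning mismatch before looking at the loss itself.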

There are 2 weird things I noticed:

  1. If I set the data type of the VAE encoder to float16, the gradient soon becomes NaN, which does not happen with SD 2.0.
  2. If I set the data type of the VAE encoder to float32, the NaN issue goes away, but I only get very blurry results at the beginning, and training diverges as it goes on.
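One possible explanation for the first observation, offered as an assumption and not confirmed anywhere in this thread, is float16 overflow: the largest finite float16 value is 65504, and any intermediate value beyond that becomes inf, which then turns into NaN in the gradients. The sketch below only illustrates that range limit using the standard library; `fits_in_fp16` is a hypothetical helper, not part of threestudio or any diffusion library.

```python
import struct


def fits_in_fp16(value):
    """Return True if `value` can be stored as a finite IEEE half float.

    struct's 'e' format is IEEE 754 binary16 and raises OverflowError
    when the value is too large to represent.
    """
    try:
        struct.pack("<e", value)
        return True
    except OverflowError:
        return False


if __name__ == "__main__":
    for v in (1.0, 65504.0, 70000.0):
        print(v, fits_in_fp16(v))
```

If this is the cause, keeping only the VAE in float32 while the rest of the pipeline stays in float16 would be consistent with the NaN disappearing in the second observation. It would not by itself explain the blurriness or the divergence.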

Looking forward to any thoughts and suggestions! Thanks in advance!

lzqsd Oct 26 '23 05:10

You can take some inspiration from the repo pinned in my profile.

mdarhdarz Oct 27 '23 04:10