Stable Diffusion XL

Open benquick123 opened this issue 11 months ago • 8 comments

Hi, has anyone been experimenting with Stable Diffusion XL? I've tried the trivial solution of changing the dreamfusion-sd config file to point to the SDXL weights, but I suspect that the way the weights are loaded has changed too, not just the model weights themselves.

For example, I get the following warnings when loading SDXL in the stable_diffusion_guidance.py file:

The config attributes {'add_watermarker': None} were passed to StableDiffusionXLPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
Keyword arguments {'add_watermarker': None, 'safety_checker': None, 'feature_extractor': None, 'requires_safety_checker': False} are not expected by StableDiffusionXLPipeline and will be ignored.
The config attributes {'force_upcast': True} were passed to AutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.

And an error upon beginning the iterations:

{...}
  File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 415, in __call__
    grad, guidance_eval_utils = self.compute_grad_sds(
  File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 242, in compute_grad_sds
    noise_pred = self.forward_unet(
  File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 154, in forward_unet
    return self.unet(
  File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 839, in forward
    if "text_embeds" not in added_cond_kwargs:
TypeError: argument of type 'NoneType' is not iterable

Any help will be greatly appreciated.

benquick123 avatar Sep 02 '23 16:09 benquick123

Yes, I have successfully run a fair number of experiments with SDXL. I may share this if threestudio does not try SDXL.

mdarhdarz avatar Sep 03 '23 07:09 mdarhdarz

@mdarhdarz It would be great if you could share your implementation!

thuliu-yt16 avatar Sep 03 '23 15:09 thuliu-yt16

Some basic suggestions:

  1. Use the fp16-fixed VAE.
  2. Prepare GPUs with more than 32 GB of VRAM.
  3. The text encoders and text embeddings differ from SD 1.5, and the UNet forward pass needs more parameters. The pipeline implementation in diffusers is a good reference.
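
To make point 3 concrete, here is a minimal sketch of the extra conditioning the SDXL UNet expects, which is what the `TypeError` in the original post is complaining about. The helper names are hypothetical, and plain Python lists stand in for tensors; only the dictionary keys `text_embeds` and `time_ids` follow the diffusers SDXL pipeline, where the time ids are the original size, crop offset, and target size concatenated.

```python
# Illustrative sketch: the helper names are hypothetical. Only the keys
# "text_embeds" and "time_ids" mirror what the diffusers SDXL UNet reads.

def build_add_time_ids(original_size, crops_coords_top_left, target_size):
    """Flatten the three 2-tuples into the six SDXL size-conditioning values."""
    return list(original_size) + list(crops_coords_top_left) + list(target_size)


def build_added_cond_kwargs(pooled_text_embeds, size=(1024, 1024)):
    """Bundle the extra conditioning that the SDXL UNet looks up by key."""
    return {
        "text_embeds": pooled_text_embeds,
        "time_ids": build_add_time_ids(size, (0, 0), size),
    }


# An SD 1.5 style call leaves added_cond_kwargs as None, so the UNet's
# membership test fails exactly as in the traceback from this issue.
missing = None
try:
    "text_embeds" in missing
except TypeError as err:
    print(type(err).__name__, err)

# Placeholder pooled embedding; a real one is a tensor from the text encoder.
added_cond_kwargs = build_added_cond_kwargs(pooled_text_embeds=[0.0] * 1280)
print(added_cond_kwargs["time_ids"])  # [1024, 1024, 0, 0, 1024, 1024]
```

In real guidance code both values would be torch tensors on the UNet's device and dtype, with `text_embeds` taken from the pooled output of SDXL's second text encoder, and the dictionary would be passed to the UNet call through its `added_cond_kwargs` argument.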

Even once you can run with SDXL, you may still not get a good 3D generation result. I will write this part up and find a way to make it public. BTW, my implementation is based on the 2022/12 version of Stable-dreamfusion, and I have made a lot of modifications over the past year.

mdarhdarz avatar Sep 04 '23 05:09 mdarhdarz

@mdarhdarz Great job! I also tried SDXL in threestudio, but I found that even with the FP16-Fixed-VAE, the 3D generation still diverges. I suspect that SDXL's VAE does not backpropagate gradients well. I don't know how you solved it.

YG256Li avatar Sep 04 '23 08:09 YG256Li

@YG256Li Yeah, I am also facing the same issue where the 3D shape cannot converge when switching to SDXL. Moreover, when I use fp16 precision, the VAE sometimes outputs NaN values. Would you mind sharing how to fix these issues? Thanks a lot! @mdarhdarz
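
One plausible mechanism for the NaN values, offered as an assumption rather than something confirmed in this thread, is fp16 overflow: half precision cannot represent anything above 65504, so an oversized activation becomes `inf`, and a later `inf - inf` yields NaN. The `force_upcast` attribute in the warnings from the original post is the diffusers VAE setting for running in float32, which points the same way. The pure-Python sketch below only illustrates the arithmetic; `to_fp16` is a hypothetical helper, not part of threestudio or diffusers.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x):
    """Round a Python float to half precision, saturating to inf on overflow."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses out-of-range values; fp16 hardware would give +/-inf.
        return math.copysign(math.inf, x)


small = to_fp16(1.0)      # representable, stays finite
big = to_fp16(70000.0)    # exceeds FP16_MAX, becomes inf
print(small, big)         # 1.0 inf
print(big - big)          # nan
```

If overflow is indeed the cause, running the VAE in float32 or using a VAE fine-tuned to keep activations within fp16 range are the directions to try first.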

lzqsd avatar Oct 13 '23 06:10 lzqsd

I'm also waiting...

mdarhdarz avatar Oct 13 '23 13:10 mdarhdarz

Hope to see more of this one day :D

g-l-i-t-c-h-o-r-s-e avatar Mar 02 '24 10:03 g-l-i-t-c-h-o-r-s-e

@g-l-i-t-c-h-o-r-s-e Here: https://github.com/fudan-zvg/PGC-3D

mdarhdarz avatar Mar 02 '24 11:03 mdarhdarz