Stable Diffusion XL
Hi, has anyone been experimenting with Stable Diffusion XL? I've tried the trivial solution of changing the dreamfusion-sd config file to point to the SDXL weights, but I reckon the way the weights are loaded has changed as well, not just the model weights themselves.
For example, I get the following warnings when loading SDXL in the stable_diffusion_guidance.py file:
The config attributes {'add_watermarker': None} were passed to StableDiffusionXLPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
Keyword arguments {'add_watermarker': None, 'safety_checker': None, 'feature_extractor': None, 'requires_safety_checker': False} are not expected by StableDiffusionXLPipeline and will be ignored.
The config attributes {'force_upcast': True} were passed to AutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.
And an error upon beginning the iterations:
{...}
File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 415, in __call__
grad, guidance_eval_utils = self.compute_grad_sds(
File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 242, in compute_grad_sds
noise_pred = self.forward_unet(
File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 154, in forward_unet
return self.unet(
File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 839, in forward
if "text_embeds" not in added_cond_kwargs:
TypeError: argument of type 'NoneType' is not iterable
Any help will be greatly appreciated.
Yes, I have experimented with SDXL successfully. Maybe I will share my implementation if threestudio does not add SDXL support itself.
@mdarhdarz It would be great if you could share your implementation!
Some basic suggestions:
- Use the fp16-fixed VAE.
- Prepare GPUs with more than 32 GB of VRAM.
- The text encoders and text embeddings are different from SD 1.5, and the UNet forward pass needs more parameters. The pipeline implementation in diffusers is a good reference (see the sketch after this list).
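For context, here is a minimal sketch of what the last point implies, loosely following StableDiffusionXLPipeline from diffusers. The checkpoint names (stabilityai/stable-diffusion-xl-base-1.0, madebyollin/sdxl-vae-fp16-fix) and the micro-conditioning values are assumptions for illustration, not anyone's working threestudio integration:

```python
import torch
from diffusers import AutoencoderKL, StableDiffusionXLPipeline

# Assumed checkpoints; substitute whatever SDXL weights your config points at.
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
)
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=vae,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# SDXL uses two text encoders; encode_prompt returns both the per-token
# embeddings and the pooled embeddings that SD 1.5 does not have.
(
    prompt_embeds,
    negative_prompt_embeds,
    pooled_prompt_embeds,
    negative_pooled_prompt_embeds,
) = pipe.encode_prompt(prompt="a DSLR photo of a hamburger", device="cuda")

# The SDXL UNet additionally expects micro-conditioning ("time ids") and the
# pooled text embeddings via added_cond_kwargs.
add_time_ids = torch.tensor(
    [[1024, 1024, 0, 0, 1024, 1024]], dtype=torch.float16, device="cuda"
)  # (original_size, crop_top_left, target_size); values here are illustrative

latents = torch.randn(1, 4, 128, 128, dtype=torch.float16, device="cuda")
t = torch.tensor([500], device="cuda")

noise_pred = pipe.unet(
    latents,
    t,
    encoder_hidden_states=prompt_embeds,
    added_cond_kwargs={
        "text_embeds": pooled_prompt_embeds,
        "time_ids": add_time_ids,
    },
).sample
```

The added_cond_kwargs dictionary is what the "'NoneType' is not iterable" error in the traceback above is complaining about: the SD 1.5 guidance code never builds it, so the SDXL UNet receives None.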
Even when you can run with SDXL, you still may not get a good 3D generation result. I will write this part down and find a way to make it public. BTW, my implementation is based on the December 2022 version of Stable-dreamfusion, and I have made a lot of modifications over the past year.
@mdarhdarz Great job! I also tried SDXL in threestudio, but I found that even with FP6-Fixed-VAE, there is still a divergence phenomenon during 3D generation. I suspect that SDXL's VAE is not very capable of gradient backward. I don't know how you solved it.
@YG256Li Yeah, I am also facing the same issue where the 3D shape cannot converge when switching to SDXL. Moreover, when I use fp16 precision, the VAE sometimes outputs NaN values. Would you mind sharing how you fixed these issues? Thanks a lot! @mdarhdarz
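For reference, one common workaround for fp16 NaNs (not necessarily what was used here) is to keep the UNet in fp16 but run the SDXL VAE encode, through which the SDS gradient flows, in float32, or to swap in the fp16-fixed VAE mentioned above. A minimal sketch, assuming a diffusers AutoencoderKL and a rendered image batch; the helper name is hypothetical:

```python
import torch
from diffusers import AutoencoderKL

def encode_images_fp32(vae: AutoencoderKL, rgb: torch.Tensor) -> torch.Tensor:
    """Encode rendered images with the VAE upcast to float32.

    rgb: (B, 3, H, W) renders in [0, 1]; the returned latents can be cast
    back to the UNet dtype (e.g. fp16) before the SDS noise prediction.
    """
    vae = vae.to(torch.float32)
    rgb = rgb.to(torch.float32) * 2.0 - 1.0  # map [0, 1] -> [-1, 1]
    posterior = vae.encode(rgb).latent_dist
    # Keep the graph: gradients from the SDS loss flow back through encode().
    latents = posterior.sample() * vae.config.scaling_factor
    return latents
```

Whether this alone fixes the convergence problem is a separate question; it only addresses the NaNs.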
I'm also waiting...
Hope to see more of this one day :D
@g-l-i-t-c-h-o-r-s-e Here: https://github.com/fudan-zvg/PGC-3D