Stable Diffusion XL
Hi, has anyone been experimenting with Stable Diffusion XL? I've tried the trivial solution of changing the dreamfusion-sd config file to point to the SDXL weights, but I reckon the way the weights are loaded has changed as well, not just the model weights themselves.
For example, I get the following warnings when loading SDXL in the stable_diffusion_guidance.py file:
The config attributes {'add_watermarker': None} were passed to StableDiffusionXLPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
Keyword arguments {'add_watermarker': None, 'safety_checker': None, 'feature_extractor': None, 'requires_safety_checker': False} are not expected by StableDiffusionXLPipeline and will be ignored.
The config attributes {'force_upcast': True} were passed to AutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.
And an error upon beginning the iterations:
{...}
File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 415, in __call__
grad, guidance_eval_utils = self.compute_grad_sds(
File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 242, in compute_grad_sds
noise_pred = self.forward_unet(
File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 154, in forward_unet
return self.unet(
File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 839, in forward
if "text_embeds" not in added_cond_kwargs:
TypeError: argument of type 'NoneType' is not iterable
Any help will be greatly appreciated.
Yes, I have experimented with SDXL successfully. Maybe I will share my implementation if threestudio does not add SDXL support itself.
@mdarhdarz It would be great if you could share your implementation!
Some basic suggestions:
- Use the fp16-fixed VAE.
- Prepare GPUs with more than 32 GB of VRAM.
- The text encoders and text embeddings are different from SD 1.5, and the UNet forward pass needs more parameters. The pipeline implementation in diffusers is a good reference (see the sketch after this list).
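For context, here is a minimal sketch of what the last point implies, loosely following StableDiffusionXLPipeline from diffusers. The checkpoint names (stabilityai/stable-diffusion-xl-base-1.0, madebyollin/sdxl-vae-fp16-fix) and the micro-conditioning values are assumptions for illustration, not anyone's working threestudio integration:

```python
import torch
from diffusers import AutoencoderKL, StableDiffusionXLPipeline

# Assumed checkpoints; substitute whatever SDXL weights your config points at.
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
)
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=vae,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# SDXL uses two text encoders; encode_prompt returns both the per-token
# embeddings and the pooled embeddings that SD 1.5 does not have.
(
    prompt_embeds,
    negative_prompt_embeds,
    pooled_prompt_embeds,
    negative_pooled_prompt_embeds,
) = pipe.encode_prompt(prompt="a DSLR photo of a hamburger", device="cuda")

# The SDXL UNet additionally expects micro-conditioning ("time ids") and the
# pooled text embeddings via added_cond_kwargs.
add_time_ids = torch.tensor(
    [[1024, 1024, 0, 0, 1024, 1024]], dtype=torch.float16, device="cuda"
)  # (original_size, crop_top_left, target_size); values here are illustrative

latents = torch.randn(1, 4, 128, 128, dtype=torch.float16, device="cuda")
t = torch.tensor([500], device="cuda")

noise_pred = pipe.unet(
    latents,
    t,
    encoder_hidden_states=prompt_embeds,
    added_cond_kwargs={
        "text_embeds": pooled_prompt_embeds,
        "time_ids": add_time_ids,
    },
).sample
```

The added_cond_kwargs dictionary is what the "'NoneType' is not iterable" error in the traceback above is complaining about: the SD 1.5 guidance code never builds it, so the SDXL UNet receives None.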
Even when you can run with SDXL, you still may not get a good 3D generation result. I will write this part down and find a way to make it public. BTW, my implementation is based on the December 2022 version of Stable-dreamfusion, and I have made a lot of modifications over the past year.
@mdarhdarz Great job! I also tried SDXL in threestudio, but I found that even with FP6-Fixed-VAE, there is still a divergence phenomenon during 3D generation. I suspect that SDXL's VAE is not very capable of gradient backward. I don't know how you solved it.
@YG256Li Yeah, I am also facing the same issue where the 3D shape cannot converge when switching to SDXL. Moreover, when I use fp16 precision, the VAE sometimes outputs NaN values. Would you mind sharing how you fixed these issues? Thanks a lot! @mdarhdarz
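For reference, one common workaround for fp16 NaNs (not necessarily what was used here) is to keep the UNet in fp16 but run the SDXL VAE encode, through which the SDS gradient flows, in float32, or to swap in the fp16-fixed VAE mentioned above. A minimal sketch, assuming a diffusers AutoencoderKL and a rendered image batch; the helper name is hypothetical:

```python
import torch
from diffusers import AutoencoderKL

def encode_images_fp32(vae: AutoencoderKL, rgb: torch.Tensor) -> torch.Tensor:
    """Encode rendered images with the VAE upcast to float32.

    rgb: (B, 3, H, W) renders in [0, 1]; the returned latents can be cast
    back to the UNet dtype (e.g. fp16) before the SDS noise prediction.
    """
    vae = vae.to(torch.float32)
    rgb = rgb.to(torch.float32) * 2.0 - 1.0  # map [0, 1] -> [-1, 1]
    posterior = vae.encode(rgb).latent_dist
    # Keep the graph: gradients from the SDS loss flow back through encode().
    latents = posterior.sample() * vae.config.scaling_factor
    return latents
```

Whether this alone fixes the convergence problem is a separate question; it only addresses the NaNs.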
I'm also waiting...
Hope to see more of this one day :D
@g-l-i-t-c-h-o-r-s-e Here: https://github.com/fudan-zvg/PGC-3D