diffusers examples/community/lpw_stable_diffusion

Describe the bug

When I used lpw_stable_diffusion_xl with a text2img model (playgroundai/playground-v2.5-1024px-aesthetic), the generated image came out gray, as if the latents had not been decoded correctly.

To fix this, I copied some code from StableDiffusionXLPipeline and used it to replace line 1889.

# Original line:
# image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]

# Check whether the VAE config provides per-channel latent statistics.
has_latents_mean = hasattr(self.vae.config, "latents_mean") and self.vae.config.latents_mean is not None
has_latents_std = hasattr(self.vae.config, "latents_std") and self.vae.config.latents_std is not None
if has_latents_mean and has_latents_std:
    # Reshape the statistics so they broadcast over (batch, channels, height, width).
    latents_mean = (
        torch.tensor(self.vae.config.latents_mean).view(1, 4, 1, 1).to(latents.device, latents.dtype)
    )
    latents_std = (
        torch.tensor(self.vae.config.latents_std).view(1, 4, 1, 1).to(latents.device, latents.dtype)
    )
    # Denormalize the latents with the per-channel statistics before decoding.
    latents = latents * latents_std / self.vae.config.scaling_factor + latents_mean
else:
    # No statistics available: only undo the scaling factor, as the original line did.
    latents = latents / self.vae.config.scaling_factor

image = self.vae.decode(latents, return_dict=False)[0]

This is a naive fix, so I'm not sure whether it works in other cases.
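
For anyone who wants to try the same change in another community pipeline, the logic above can be pulled out into a standalone function and checked without loading a model. This is only a minimal sketch under stated assumptions, not code from diffusers: the name `denormalize_latents` is hypothetical, `SimpleNamespace` stands in for `self.vae.config`, and the statistics are made-up numbers chosen so the arithmetic is easy to follow. It also uses `view(1, -1, 1, 1)` instead of the hard-coded `4` so the channel count is inferred.

```python
from types import SimpleNamespace

import torch


def denormalize_latents(latents, vae_config):
    """Undo latent scaling before VAE decoding (hypothetical helper)."""
    mean = getattr(vae_config, "latents_mean", None)
    std = getattr(vae_config, "latents_std", None)
    if mean is not None and std is not None:
        # Per-channel statistics, reshaped to broadcast over (B, C, H, W).
        mean = torch.tensor(mean).view(1, -1, 1, 1).to(latents.device, latents.dtype)
        std = torch.tensor(std).view(1, -1, 1, 1).to(latents.device, latents.dtype)
        return latents * std / vae_config.scaling_factor + mean
    # No statistics in the config: only undo the scaling factor.
    return latents / vae_config.scaling_factor


# Made-up config values for illustration only (not taken from any real model).
cfg_with_stats = SimpleNamespace(
    scaling_factor=0.5,
    latents_mean=[1.0, 2.0, 3.0, 4.0],
    latents_std=[2.0, 2.0, 2.0, 2.0],
)
cfg_plain = SimpleNamespace(scaling_factor=0.5)

latents = torch.ones(1, 4, 2, 2)
with_stats = denormalize_latents(latents, cfg_with_stats)  # 1 * 2 / 0.5 + mean
plain = denormalize_latents(latents, cfg_plain)            # 1 / 0.5
```

In a pipeline, the call would presumably look like `latents = denormalize_latents(latents, self.vae.config)` right before `self.vae.decode(...)`, but I have only exercised the branch logic here, not a full generation run.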

Reproduction

from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "playgroundai/playground-v2.5-1024px-aesthetic", # perhaps you can change to other text2img to reproduce?
    custom_pipeline="lpw_stable_diffusion_xl",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

prompt = "Create a detailed and refined image of Zero Two from the anime Darling in the Franxx. She is known for her distinctive pink hair and mesmerizing green eyes. She should be depicted in a dynamic pose, showcasing her strong and fearless personality. The image should be in anime style, with an 8k resolution and a 16:9 aspect ratio. The background should be a battlefield, symbolizing the constant fights she has to face. Despite the harsh environment, she maintains a confident and determined expression. The background should be black." # This one generated by copilot, don't focus on this:)

image = pipe(prompt=prompt, num_inference_steps=50, guidance_scale=3).images[0]

image.save("Zero Two.png")

Logs

No response

System Info

diffusers version: 0.27.0
Platform: Linux-5.15.0-102-generic-x86_64-with-glibc2.31
Python version: 3.10.13
PyTorch version (GPU?): 2.1.1 (True)
Huggingface_hub version: 0.21.3
Transformers version: 4.38.1
Accelerate version: 0.27.2
xFormers version: not installed
Using GPU in script?:
Using distributed or parallel set-up in script?:

Who can help?

No response

May 02 '24 17:05 HACLINE

Hi, thank you for reporting the issue.

Those changes are needed for playground-v2.5 to work. Since the community pipelines are maintained by the community, I think most of them won't work with that model right now until someone updates them (usually the original contributors).

Maybe you can tag them or you can open a PR yourself if you want to contribute.

May 03 '24 03:05 asomoza

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Sep 14 '24 15:09 github-actions[bot]

examples/community/lpw_stable_diffusion_xl.py Not correctly decoded
