
Severe difference with A1111

alexblattner opened this issue 9 months ago · 4 comments

Describe the bug

I am using a custom model. On A1111 it is far more colorful than on diffusers. I am aware that it's impossible to replicate images exactly between the two with the same input, but my observation holds across many examples.

Here are the results for diffusers with one prompt across guidance scales: [three example images]

The same for A1111: [five example images]

I know what you may say: it's unscientific. But this is my experience across multiple images, with and without ControlNet and IP-Adapter. A1111 is consistently closer to the style, while diffusers has less color, tries to be more realistic, and burns more frequently.
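As background on burning at high guidance: Stable Diffusion frontends combine the conditional and unconditional noise predictions with classifier-free guidance, and large scales amplify the difference between the two, which tends to oversaturate outputs. A minimal numeric sketch of the formula (plain Python, not diffusers code; the vectors are made up for illustration):

```python
# Classifier-free guidance: guided = uncond + scale * (cond - uncond).
# The lists below stand in for noise predictions; values are illustrative.

def cfg(uncond, cond, scale):
    """Combine unconditional and conditional predictions with a guidance scale."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

uncond = [0.0, 2.0]  # "unconditional" prediction (empty prompt)
cond = [1.0, 4.0]    # "conditional" prediction (your prompt)

print(cfg(uncond, cond, 1.0))  # [1.0, 4.0]: scale 1 returns the conditional prediction
print(cfg(uncond, cond, 7.5))  # [7.5, 17.0]: the cond-uncond difference amplified 7.5x
```

The further the guided prediction is pushed from the unconditional one, the easier it is to leave the range the VAE decodes cleanly, which is one common source of "burned" images.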

Reproduction

I can't give you a code snippet. It's just a basic comparison against A1111 results for heavily stylized models.

Logs

No response

System Info


Who can help?

@DN6 @yiyixuxu sorry for the very vague issue, wish I could do better

alexblattner avatar May 02 '24 17:05 alexblattner

Hi, maybe you can't give code, but can you share the prompt, model, and parameters? I can generate a lot of images, but I won't know how they differ from what you're doing.

Also, I don't get your comparison: the diffusers example is a portrait of a man and the auto1111 one is a mix of portrait and half-body shots of a woman, so you're not even using the same prompt? To compare them you should at least fix all the generation parameters, even if they won't produce the same image.

Just using a low res image of what you generated with IP Adapters, I can get the saturation and style without problems.

[four example images]

I could probably do better if I had the prompt and the style you're using.

asomoza avatar May 02 '24 18:05 asomoza

@asomoza you are correct. here's the model: https://drive.google.com/file/d/10GCQNP13YIuw8dX8zyAztUY-_lq5JH8m/view?usp=drive_link

prompt: "1boy, brown hair, waltz with bashir style, archer style"
negative_prompt: "(worst quality, low quality), childlike, petite, loli,"
steps: 30
guidance_scale: 7.5
ip_scale: 1
ip_s_scale: 1
ip adapter: ip-adapter-faceid-plusv2_sd15.bin
ip_image: Screenshot_2024-05-02_150906

The model is in diffusers format, so from_pretrained will work on it. I don't have it in a format for A1111 at the moment, but I doubt you'd want to download the same model twice anyway.

alexblattner avatar May 03 '24 00:05 alexblattner

I'm kind of curious how you tested the model with auto1111 if you don't have a compatible version, but anyway, I had a suspicion about this: most of the time, when you get that kind of image with SD 1.5, it's the VAE.

So I just switched the VAE and it worked; I didn't even have to test with IP Adapters.

import torch
from diffusers import AutoencoderKL, StableDiffusionPipeline

# Load a known-good SD 1.5 VAE from a single .safetensors file
vae = AutoencoderKL.from_single_file(
    "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/blob/main/vae-ft-mse-840000-ema-pruned.safetensors",
    torch_dtype=torch.float16,
).to("cuda")

# Pass the replacement VAE when loading the pipeline
pipe = StableDiffusionPipeline.from_pretrained(
    "./models/poselabsv12",
    torch_dtype=torch.float16,
    vae=vae,
).to("cuda")
original vs. switched VAE: [two comparison images]

So I recommend you switch your VAE for one of the good ones used in the popular models. I tested with this one, which is not in diffusers format, because I was already testing another SD 1.5 pipeline.

asomoza avatar May 03 '24 02:05 asomoza

I'll try that out. I'm working with someone who uses A1111, which is why I'm in this situation.

alexblattner avatar May 03 '24 15:05 alexblattner

That was the issue, thank you!

alexblattner avatar May 04 '24 19:05 alexblattner

The real issue was that the FaceID strength is higher in diffusers than in A1111; apply the LoRA at a scale of -0.5 to fix it.
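For context on why a negative scale weakens the adapter: a LoRA contributes a low-rank update to each base weight, W_eff = W + scale · (B @ A), so a negative scale subtracts part of that update instead of adding it. A toy numeric sketch (plain Python, not diffusers code; the matrices are made up for illustration):

```python
# A LoRA adds a low-rank update to a base weight matrix:
#   W_eff = W + scale * (B @ A)
# A negative scale subtracts part of the update, weakening the adapter.

def apply_lora(W, A, B, scale):
    """Return W + scale * (B @ A) for nested-list matrices (rank = len(A))."""
    rank = len(A)
    out = [row[:] for row in W]
    for i in range(len(W)):
        for j in range(len(W[0])):
            delta = sum(B[i][r] * A[r][j] for r in range(rank))
            out[i][j] += scale * delta
    return out

W = [[1.0, 0.0], [0.0, 1.0]]  # base weight
A = [[0.2, 0.4]]              # rank-1 LoRA down-projection
B = [[1.0], [1.0]]            # rank-1 LoRA up-projection

full = apply_lora(W, A, B, 1.0)       # full-strength adapter, e.g. full[0][0] ≈ 1.2
weakened = apply_lora(W, A, B, -0.5)  # negative scale pulls below the base weight
```

With scale 0 the base weight is unchanged, and each step toward -1 cancels more of what the adapter's training added.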

alexblattner avatar Jul 08 '24 11:07 alexblattner