
The original model and the diffusers model do not give the same results

Open · dingjingzhen opened this issue Nov 30 '22 · 3 comments

When the original model and the diffusers model are run separately, img2img produces different results with the same seed.

dingjingzhen · Nov 30 '22

This happens when converting models to a different format. A .ckpt file is different from the diffusers format, where the whole model is split into a group of folders, one per component.
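For illustration, a minimal sketch of how those components show up once a checkpoint is in diffusers format (the local path is just a placeholder, and the exact keys depend on the diffusers version):

import torch
from diffusers import StableDiffusionImg2ImgPipeline

# in diffusers format, every component lives in its own sub-folder of the model directory
pipe = StableDiffusionImg2ImgPipeline.from_pretrained("./diffusers-models")
print(pipe.components.keys())
# typically: vae, text_encoder, tokenizer, unet, scheduler, safety_checker, feature_extractor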

Converting to ONNX or another format (OpenVINO, NCNN) means you need to map out entirely new seeds for that model format.

ClashSAN · Nov 30 '22

> This happens when converting models to a different format. A .ckpt file is different from the diffusers format, where the whole model is split into a group of folders, one per component.

I know that when I convert the original model to a diffusers model via the script provided by diffusers, the results stay consistent for txt2img, but not for img2img. Since my model is trained with the original code but I want to use diffusers for inference, this issue is still important to me.
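For reference, a conversion like that is typically done with the conversion script shipped in the diffusers repo, roughly like this (the paths are placeholders, and flag names may vary between diffusers versions):

python scripts/convert_original_stable_diffusion_to_diffusers.py --checkpoint_path ./model.ckpt --dump_path ./diffusers-models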

dingjingzhen · Nov 30 '22

oh, sorry, I guess I never tried .ckpt to diffusers

ClashSAN · Nov 30 '22

Hey @dingjingzhen,

Could you maybe copy-paste a reproducible code snippet here or a Google Colab? It's probably related to the random seed - we made sure that our outputs match the outputs of CompVis.

patrickvonplaten · Dec 01 '22

Thank you very much for your reply @patrickvonplaten. This is how I ran the original model:

python scripts/img2img.py --prompt "A fantasy landscape, trending on artstation" --init-img ./sketch-mountains-input.jpg --strength 0.75 --seed 42

diffusers model:

import os
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline, DDIMScheduler

# load the pipeline
device = "cuda"
model_id_or_path = "./diffusers-models"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_id_or_path,
)
pipe = pipe.to(device)

# use the DDIM scheduler to match the sampler of the original script
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

assert os.path.isfile("./sketch-mountains-input.jpg")
image = Image.open("./sketch-mountains-input.jpg").convert("RGB")

prompt = "A fantasy landscape, trending on artstation"
generator = torch.Generator("cuda").manual_seed(42)
images = pipe(
    prompt=prompt,
    generator=generator,
    init_image=image,
    strength=0.75,
    guidance_scale=5.0,
).images

images[0].save("fantasy_landscape.png")

The diffusers source code has been partially modified from the original code, and the input images have been kept consistent.

But the output is not consistent. Can you help me see what the reason is, or can you provide me with a demo that gives consistent output?

dingjingzhen · Dec 02 '22

@patrickvonplaten I have compared the output of each step of the code and found that it becomes inconsistent after prepare_latents. I haven't found the reason for this yet; I don't know if I am doing something wrong.
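For what it's worth, a minimal sketch of how the initial latents (before the forward-diffusion noise is added) can be recomputed by hand, so they can be compared against what the CompVis script produces after encoding the init image. The 768x512 resize and the 0.18215 scaling factor are assumptions that match the SD v1 defaults:

import torch
import numpy as np
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

device = "cuda"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained("./diffusers-models").to(device)

# preprocess the init image the same way the pipeline does
img = Image.open("./sketch-mountains-input.jpg").convert("RGB").resize((768, 512))
img = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0   # scale pixels to [-1, 1]
img = img.permute(2, 0, 1).unsqueeze(0).to(device)             # HWC -> NCHW

# encode with the VAE and sample the latent distribution with a fixed seed
generator = torch.Generator(device).manual_seed(42)
with torch.no_grad():
    latent_dist = pipe.vae.encode(img).latent_dist
    init_latents = 0.18215 * latent_dist.sample(generator=generator)

# summary statistics of the latents for a quick comparison against the original script
print(init_latents.shape, init_latents.mean().item(), init_latents.std().item())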

dingjingzhen · Dec 02 '22

@patrickvonplaten hi, is there a problem with my usage?

dingjingzhen · Dec 05 '22

Hey @dingjingzhen,

Thanks for opening the issue! I'm looking into it currently. Some tips on how to make the issue a bit easier to follow:

1.) Always provide all the necessary paths and checkpoints. Your code snippet as-is above cannot be run. The following is better:
#!/usr/bin/env python3
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers import DDIMScheduler

image_path = "./sketch-mountains-input.jpg"

# load the pipeline
device = "cuda"
model_id_or_path = "CompVis/stable-diffusion-v1-4"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_id_or_path,
)
pipe = pipe.to(device)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

image = Image.open(image_path).convert("RGB")

prompt = "A fantasy landscape, trending on artstation"
generator = torch.Generator("cuda").manual_seed(42)
images = pipe(
    prompt=prompt,
    generator=generator,
    init_image=image,
    strength=0.75,
    guidance_scale=5.0,
).images

images[0].save("fantasy_landscape.png")

with "./sketch-mountains-input.jpg" being: https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg
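For convenience, the input image can be fetched with something like:

import requests
from io import BytesIO
from PIL import Image

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
# download the sketch and save it under the path used in the snippets above
Image.open(BytesIO(requests.get(url).content)).convert("RGB").save("./sketch-mountains-input.jpg")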

2.) Your CompVis command is also not correct - it produces two images instead of one. Instead the command should be as follows:

python scripts/img2img.py --prompt "A fantasy landscape, trending on artstation" --init-img ./sketch-mountains-input.jpg --strength 0.75 --seed 42 --n_samples 1

patrickvonplaten · Dec 10 '22

Ok, that took me a while :sweat_smile:. I'm quite convinced that there is a bug in the original CompVis img2img script - see: https://github.com/CompVis/stable-diffusion/pull/533

Let's wait and see what CompVis says.

patrickvonplaten · Dec 10 '22

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Jan 04 '23