
Mixture-of-Experts partial diffusion implementation for base SD 1.x / 2.x pipelines

Open · bghira opened this pull request 1 year ago · 32 comments

What does this PR do?

This pull request ports the denoising_end support from the SDXL text2img pipeline to the base text2img pipeline, and the denoising_start and denoising_end support from the SDXL img2img pipeline to the base img2img pipeline.

This brings legacy SD model capabilities in line with SDXL.

Enhances #4003

Example

Used on a Stable Diffusion 2.1 checkpoint fine-tuned with zero terminal SNR:

(four example output images)
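
For orientation, here is a minimal sketch of the call pattern this PR enables on the base pipeline. The model name and values are illustrative rather than the exact setup behind the images above; complete two-model snippets appear later in the thread.

from diffusers import DiffusionPipeline
import torch

# Load the base SD 2.1 pipeline (any SD 1.x / 2.x checkpoint works the same way).
base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# denoising_end (added by this PR, mirroring SDXL) stops the base model after the
# first 80% of the schedule and returns latents for a second model to finish.
latents = base(
    "A majestic lion jumping from a big stone at night",
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="latent",
).images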

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline?
  • [ ] Did you read our philosophy doc (important for complex PRs)?
  • [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

bghira · Jul 29 '23

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@patrickvonplaten making sure this one doesn't get lost

bghira · Aug 02 '23

Very cool addition! Could we maybe add some tests? I think they can be very similar to the ones we added here: https://github.com/huggingface/diffusers/blob/d0b8de1262ba785474fc9df53c29ba44ec02c715/tests/pipelines/stable_diffusion_xl/test_stable_diffusion_xl.py#L264 (feel free to copy-paste)

patrickvonplaten · Aug 03 '23

Let us know if you need any help with the new tests or the currently failing tests, @bghira :-)

patrickvonplaten · Aug 23 '23

@patrickvonplaten sorry, I've been really busy testing SDXL training and haven't had time to follow up here. I would be glad for the assist!

bghira · Aug 23 '23

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Oct 18 '23

@patrickvonplaten

bghira · Oct 18 '23

@bghira for which SD 1.x / SD 2.x models does mixture of experts work well?

patrickvonplaten · Oct 28 '23

The result above is actually from passing the partially diffused output of SD 2.1 through SDXL.

bghira · Oct 28 '23

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Nov 22 '23

@yiyixuxu WDYT?

sayakpaul · Nov 27 '23

I don't think we need to prio this PR at the moment

patrickvonplaten · Nov 27 '23

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Dec 26 '23

come on

bghira · Jan 19 '24

@bghira sorry to be so late here. Could you maybe post a code snippet showing how this PR enables high-quality images with Mixture of Experts for SD 1.x and 2.x? Given that we don't have a 1.x or 2.x denoiser checkpoint, I'm a bit unsure whether this is needed yet, to be honest.

patrickvonplaten · Jan 19 '24

My trainer can make them, and my 2.x checkpoints can make use of this. I showed images above. It sounds like you really just don't want this. You can reject it; people can just use some other library or toolkit to make it happen.

bghira · Jan 19 '24

FWIW, this even allows using the SDXL refiner to complete inference on SD 1.5 or 2.x.

bghira · Jan 19 '24

FWIW, this even allows using the SDXL refiner to complete inference on SD 1.5 or 2.x.

Can you add a quick code snippet for this?

patrickvonplaten · Jan 23 '24

from diffusers import DiffusionPipeline
import torch

# Base model: SD 2.1 handles the first part of the denoising schedule.
base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

# The SDXL refiner picks up where the base model stops.
refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
).to("cuda")

prompt = "A majestic lion jumping from a big stone at night"

# Run the base model over the first 80% of the schedule (denoising_end=0.8).
base_image = base(
    prompt=prompt,
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="pil",
).images[0]
# Hand the partially denoised image to the refiner for the final 20%.
image = refiner(
    prompt=prompt,
    num_inference_steps=40,
    denoising_start=0.8,
    image=base_image,
).images[0]
base_image.save('base_image.png', format='PNG')
image.save('image.png', format='PNG')

bghira · Jan 23 '24
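
For intuition, the 0.8 split used above works out roughly as follows. This is a sketch assuming the usual 1000-step training schedule, not the pipelines' exact cutoff code.

# Rough arithmetic behind denoising_end=0.8 / denoising_start=0.8 with 40 steps.
num_train_timesteps = 1000   # standard SD training schedule length (assumed)
num_inference_steps = 40
split = 0.8

cutoff_timestep = int(round(num_train_timesteps * (1 - split)))  # hand-off near t = 200
base_steps = int(round(num_inference_steps * split))             # ~32 steps on the base model
refiner_steps = num_inference_steps - base_steps                 # ~8 steps on the refiner

print(cutoff_timestep, base_steps, refiner_steps)                # 200 32 8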

To use an SD 2.x zero-terminal-SNR checkpoint that is fine-tuned for steps 0-400:

from diffusers import DiffusionPipeline
import torch

# Base model: SD 2.1 covers the high-noise portion of the schedule.
base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

# SD 2.x fine-tune acting as the low-noise "refiner" expert (needs the
# denoising_start / image support this PR adds to the img2img path).
refiner = DiffusionPipeline.from_pretrained(
    "ptx0/pseudo-flex-base",
    torch_dtype=torch.float16,
    use_safetensors=True,
).to("cuda")

prompt = "A majestic lion jumping from a big stone at night"

# Stop the base model at 80% of the schedule and return the raw latents.
base_latents = base(
    prompt=prompt,
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="latent",
).images
# Finish the remaining 20% of the schedule from those latents. The latents
# cannot be saved as a PNG directly, so only the final image is saved.
image = refiner(
    prompt=prompt,
    num_inference_steps=40,
    denoising_start=0.8,
    image=base_latents,
).images[0]
image.save('image.png', format='PNG')

bghira · Jan 23 '24

I'm not getting great results with https://github.com/huggingface/diffusers/pull/4355#issuecomment-1906728628, and https://github.com/huggingface/diffusers/pull/4355#issuecomment-1906736194 doesn't work on this branch for me.

I'm OK adding it, though; it makes sense to have this functionality once we have a refiner model trained for SD 2.1.

patrickvonplaten · Feb 09 '24

@yiyixuxu could you give this a review?

patrickvonplaten · Feb 09 '24

Ah, yeah, to be fair I haven't pulled this branch in some time. It's not very easy for me to test this stuff locally, as I'm in Central America and downloading these models takes a very long time, if it completes at all.

Once it is in, though, I can revisit it and put some compute toward training a specific refiner for this case.

bghira · Feb 09 '24

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Mar 07 '24

Not stale.

sayakpaul · Mar 07 '24

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Apr 02 '24

@bghira do we still want to support this? Happy to help with the PR, but it only makes sense if we are going to have the checkpoints.

yiyixuxu · Apr 03 '24

Yes, but any SD model can be an expert: pick one for composition and one for details or style, and split the job between them.

bghira · Apr 03 '24
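
As a rough illustration of that two-expert split: the sketch below uses two placeholder SD 1.5-family checkpoint names and an arbitrary 60/40 split, and it assumes the denoising_end / denoising_start support this PR adds to the base pipelines.

from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline
import torch

# Hypothetical "composition" expert (placeholder model name).
composition = StableDiffusionPipeline.from_pretrained(
    "your-org/sd15-composition-expert", torch_dtype=torch.float16
).to("cuda")

# Hypothetical "detail / style" expert, loaded as img2img so it can resume from latents.
detail = StableDiffusionImg2ImgPipeline.from_pretrained(
    "your-org/sd15-detail-expert", torch_dtype=torch.float16
).to("cuda")

prompt = "A majestic lion jumping from a big stone at night"

# The composition expert handles the high-noise 60% of the schedule.
latents = composition(
    prompt=prompt,
    num_inference_steps=30,
    denoising_end=0.6,
    output_type="latent",
).images

# The detail expert finishes the remaining 40% from the shared latent space.
image = detail(
    prompt=prompt,
    num_inference_steps=30,
    denoising_start=0.6,
    image=latents,
).images[0]
image.save("two_expert_moe.png")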

@bghira

would you be able to provide examples so we can make a doc page for this?

Code-wise, I think there's not much left to finish up. Do you want to finish it, or would you prefer that I take over this PR?

yiyixuxu · Apr 03 '24

@yiyixuxu actually I'll be able to put more effort into this soon, thanks for the patience.

bghira · Apr 27 '24