diffusers
Separating legacy inpaint pipeline
The new fine-tuned inpaint pipeline in the 0.6.0 release is very good, but its UNet architecture is different from the original pipeline's. I think the inpaint pipeline in the 0.5.1 release has some advantages, such as sharing the same model with txt2img on the GPU or inheriting a DreamBooth UNet trained on a txt2img model. I actually often use a DreamBooth model for inpainting, and that's a pretty big advantage. So, how about adding a legacy inpaint pipeline so that people can still use a fine-tuned UNet to inpaint?
Also, I suggest adding a strength parameter to the new inpaint pipeline. It could be implemented by keeping the mask input fixed and diffusing only the image latents, just like in the img2img pipeline.
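For reference, here is a rough sketch of how strength is typically realized in img2img-style pipelines: it decides how far along the noise schedule the initial latents start and therefore how many denoising steps are actually run. This is the general idea, not the exact library code; `scheduler` and `init_latents` are assumed to exist already.

```python
import torch

# Assumed: `scheduler` (with set_timesteps already called for num_inference_steps)
# and `init_latents` (the VAE-encoded input image latents).
num_inference_steps = 50
strength = 0.75  # strength=1.0 would noise the latents all the way to x_T (txt2img-like)

# How many of the scheduler's steps to actually run.
init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
t_start = max(num_inference_steps - init_timestep, 0)
timesteps = scheduler.timesteps[t_start:]

# Noise the initial latents up to the starting timestep, then denoise from there.
noise = torch.randn_like(init_latents)
init_latents = scheduler.add_noise(init_latents, noise, timesteps[:1])
```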
Hey @juno-hwang,
We still support the old "inpaint" pipeline it's just moved to "Legacy" now: https://github.com/huggingface/diffusers/blob/8be48507a0ea4c9e6771fbfbac1333ee7bb90993/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py#L46
Happy to not deprecate it if you want :-)
Hello @patrickvonplaten ,
I suppose you're referring to using the StableDiffusionInpaintPipelineLegacy class with the old weights (CompVis/stable-diffusion-v1-4), because if you want to test the new inpaint pipeline (0.6.0 release) with the new weights (runwayml/stable-diffusion-inpainting) you get the following error:
RuntimeError: Given groups=1, weight of size [320, 9, 3, 3], expected input[2, 4, 64, 64] to have 9 channels, but got 4 channels instead
This means the architecture has changed. So in the end, as @juno-hwang suggested, it would be great to add a strength parameter to the new inpaint pipeline with the new weights (runwayml/stable-diffusion-inpainting).
Hey @MartiGrau,
Currently we don't have enough time to do the PR ourselves, but if you think it makes sense to add a strength parameter, I'm more than happy to review a PR :-)
Hey @patrickvonplaten 👋
Are there currently any plans to also support Dreambooth training (https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py) with the new inpainting pipeline (https://huggingface.co/runwayml/stable-diffusion-inpainting)?
If not, I would be happy to try to implement this if this makes sense.
Think this could definitely make sense @patil-suraj what do you think? :-)
Hey @ulmewennberg, that would be awesome! Feel free to work on the PR and let us know if you have any questions, happy to help :)
@MartiGrau, the strength argument is not really needed here. It's used in the legacy inpainting pipeline because that pipeline is based on the image2image example, where we provide an initial image and the strength argument decides how much noise to add to it (which also controls the number of inference timesteps). That is not the case here: we don't use the image as an init_image; the image is converted into latents and passed directly to the model along with the mask.
Hey @patil-suraj, as you said strength is not strictly necessary, but I think introducing it could allow us to repaint the given region while keeping some consistency, just like the legacy inpainting pipeline.
Also, I found an interesting phenomenon in the img2img pipeline: theoretically, strength=1 should be equivalent to txt2img, because it means the original image is fully diffused into a pure Gaussian $x_T$, but for some reason the result is different and still keeps some of the original image's information. The figure above shows the difference between txt2img and img2img with strength=1, starting from a plain white image. I guess it's due to an improper index shift in the scheduler.
Furthermore, the legacy inpainting pipeline is not good at seamless inpainting with high strength. For this reason, it would be very useful for image editing if the new inpainting pipeline had a proper strength implementation.
I see, thanks for reporting, will take a look at this.
Also, as I said above, we don't use an init_image in inpainting and we don't add any noise to the image and mask passed for inpainting; because of this, it's not possible to add strength here.
However, to keep consistency, what I could suggest is:
get the in-painted image from the pipeline and then replace the non-masked parts of the image with the original image. That could achieve what you are looking for.
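A minimal sketch of that compositing step, assuming PIL inputs and the usual mask convention (white = region to repaint); the file names and prompt are placeholders:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("original.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask.png").convert("L").resize((512, 512))

result = pipe(prompt="a red couch", image=init_image, mask_image=mask_image).images[0]

# Paste the repainted pixels back into the original: everything outside the
# mask stays pixel-identical to the input image.
composited = Image.composite(result, init_image, mask_image)
composited.save("inpainted.png")
```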
Awesome, @patil-suraj and @patrickvonplaten! Some questions so far on the implementation:
"For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself)"
- Should I add noise to all the 9 input channels? Or perhaps not add it to the mask?
- The output has only 4 channels. How do I compute the loss between the 4-channel noise_pred and the 9-channel input? Should I blend the two four-channel tensors using the mask in some way, or, for example, just take the first 4 channels?
@ulmewennberg I'm also working on it myself; judging from the inpainting pipeline, it seems to only add noise to the image latents.
I made a pull request here, but it's not passing the tests. Can someone guide me on how to fix it? It's my first contribution and I'm not sure how to do it right. @ulmewennberg @patil-suraj
Hey @ulmewennberg !
"Should I add noise to all the 9 input channels? Or perhaps not add it to the mask?"

The noise is added to the target image that we want the model to generate.

"The output has only 4 channels. How do I compute the loss between the 4-channel noise_pred and the 9-channel input?"
We add noise to the target encoded image which has 4 channels, and the loss is simply computed against that noise.
Hope this makes it clear.
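For concreteness, a rough sketch of how that training step could be assembled (assuming the usual objects from the DreamBooth training script, e.g. `vae`, `unet`, `noise_scheduler`, and precomputed `text_embeddings`; this is the general idea, not the exact code from the PR):

```python
import torch
import torch.nn.functional as F

# Assumed batch tensors:
#   pixel_values:        (B, 3, H, W)  target images
#   masked_pixel_values: (B, 3, H, W)  images with the masked region blanked out
#   masks:               (B, 1, H, W)  binary masks (1 = region to inpaint)

# Encode both images to latents (4 channels each).
latents = vae.encode(pixel_values).latent_dist.sample() * 0.18215
masked_latents = vae.encode(masked_pixel_values).latent_dist.sample() * 0.18215

# Downsample the mask to latent resolution (1 channel).
mask = F.interpolate(masks, size=latents.shape[-2:])

# Noise is added ONLY to the target latents, not to the mask or masked-image latents.
noise = torch.randn_like(latents)
timesteps = torch.randint(
    0, noise_scheduler.config.num_train_timesteps,
    (latents.shape[0],), device=latents.device,
).long()
noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

# 4 (noisy latents) + 1 (mask) + 4 (masked-image latents) = 9 input channels.
model_input = torch.cat([noisy_latents, mask, masked_latents], dim=1)

# The UNet predicts the 4-channel noise; the loss is computed against that noise only.
noise_pred = unet(model_input, timesteps, encoder_hidden_states=text_embeddings).sample
loss = F.mse_loss(noise_pred.float(), noise.float(), reduction="mean")
```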
Thanks for the PR @thedarkzeno ! will take a look.
Hey @thedarkzeno and @ulmewennberg, actually I'm not sure I understand this. What's the goal of doing DreamBooth with inpainting? How would it work?
@patil-suraj the goal is to be able to, for example, perform image inpainting using custom concepts.
I see, so you mean inpainting the custom concepts into the masked part?
@patil-suraj yep, this is correct
I'm also receiving the exact same error using the latest 0.7.2 diffusers library, StableDiffusionInpaintPipeline, and the latest runwayml/stable-diffusion-inpainting model. Is there anything I have to change on my end to fix this?
@Mystfit
The StableDiffusionInpaintPipeline should work with the runwayml/stable-diffusion-inpainting model. Could you please post a code snippet so we can take a look?
I think I've tracked down the cause of the error in my case. I'm using the StableDiffusionLongPromptWeightingPipeline (lpw_stable_diffusion) custom pipeline to add prompt weighting support and it doesn't seem to play well with the inpainting model or pipeline. When I remove the custom pipeline, then the error goes away.
@thedarkzeno I'm currently experimenting with your code a bit. Did you manage to get good results with it? Any chance you can share the flags you used?
@thedarkzeno what did you use for pretrained_model_name_or_path ("runwayml/stable-diffusion-inpainting"?)? I'm trying to launch the code, but it looks like it fails in with_prior_preservation mode.
Hi everyone, I haven't tested it extensively but got some interesting results, like this one:
It's still not perfect, but I'll see what I can do to improve.
@opetliak Yes, I used runwayml/stable-diffusion-inpainting. The prior preservation is indeed broken, but I'll try to fix it.
Very cool!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @patrickvonplaten,
I see that the legacy pipeline is being deprecated. I work regularly with older models, and I find it very useful. Would it be possible to NOT deprecate this pipeline?
If you absolutely have to, what are my options to keep using this pipeline?