diffusers
Separating legacy inpaint pipeline
The new fine-tuned inpaint pipeline in the 0.6.0 release is very good, but its UNet architecture is different from the original pipeline's. I think the inpaint pipeline in the 0.5.1 release has some advantages, such as sharing the same model with txt2img on the GPU or inheriting a DreamBooth UNet trained on a txt2img model. I actually often use a DreamBooth model for inpainting, and that's a pretty big advantage. So, how about adding a legacy inpaint pipeline so that people can still use a fine-tuned UNet to inpaint?
Also, I suggest adding a strength parameter to the new inpaint pipeline. It could be implemented by keeping the mask input fixed and diffusing only the image latents, just like in the img2img pipeline.
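For reference, here is a rough sketch of how strength is typically realized in img2img-style pipelines: it decides how far along the noise schedule the initial latents start and therefore how many denoising steps are actually run. This is the general idea, not the exact library code; `scheduler` and `init_latents` are assumed to exist already.

```python
import torch

# Assumed: `scheduler` (with set_timesteps already called for num_inference_steps)
# and `init_latents` (the VAE-encoded input image latents).
num_inference_steps = 50
strength = 0.75  # strength=1.0 would noise the latents all the way to x_T (txt2img-like)

# How many of the scheduler's steps to actually run.
init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
t_start = max(num_inference_steps - init_timestep, 0)
timesteps = scheduler.timesteps[t_start:]

# Noise the initial latents up to the starting timestep, then denoise from there.
noise = torch.randn_like(init_latents)
init_latents = scheduler.add_noise(init_latents, noise, timesteps[:1])
```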
Hey @juno-hwang,
We still support the old "inpaint" pipeline it's just moved to "Legacy" now: https://github.com/huggingface/diffusers/blob/8be48507a0ea4c9e6771fbfbac1333ee7bb90993/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py#L46
Happy to not deprecate it if you want :-)
Hello @patrickvonplaten ,
I suppose you're referring to using the StableDiffusionInpaintPipelineLegacy class with the old weights (CompVis/stable-diffusion-v1-4), because if you want to test the new inpaint pipeline (0.6.0 release) with the new weights (runwayml/stable-diffusion-inpainting) you get the following error:
RuntimeError: Given groups=1, weight of size [320, 9, 3, 3], expected input[2, 4, 64, 64] to have 9 channels, but got 4 channels instead
This means the architecture has changed. So in the end, as @juno-hwang suggested, it would be great to add a strength parameter to the new inpaint pipeline with the new weights (runwayml/stable-diffusion-inpainting).
Hey @MartiGrau,
Currently we don't have enough time to do the PR ourselves, but if you think it makes sense to add a strength parameter, I'm more than happy to review a PR :-)
Hey @patrickvonplaten 👋
Are there currently any plans to also support Dreambooth training (https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py) with the new inpainting pipeline (https://huggingface.co/runwayml/stable-diffusion-inpainting)?
If not, I would be happy to try to implement this if this makes sense.
Think this could definitely make sense @patil-suraj what do you think? :-)
Hey @ulmewennberg, that would be awesome! Feel free to work on the PR and let us know if you have any questions, happy to help :)
@MartiGrau, the strength argument is not really needed here. It's used in the legacy inpainting pipeline because that pipeline is based on the image2image example, where we provide an initial image and the strength argument decides how much noise to add to it (which also controls the number of inference timesteps). That is not the case here: we don't use the image as an init_image; the image is converted into latents and passed directly to the model along with the mask.
Hey @patil-suraj, as you said strength is not strictly necessary, but I think introducing it could allow us to repaint the given region while keeping some consistency, just like the legacy inpainting pipeline.
Also, I found an interesting phenomenon in the img2img pipeline: theoretically, strength=1 should be equivalent to txt2img, because it means the original image is fully diffused into a pure Gaussian $x_T$, but for some reason the result is different and still keeps some of the original image's information. The figure above shows the difference between txt2img and img2img with strength=1, starting from a plain white image. I guess it's due to an improper index shift in the scheduler.
Furthermore, the legacy inpainting pipeline is not good at seamless inpainting with high strength. For this reason, it would be very useful for image editing if the new inpainting pipeline had a proper strength implementation.
I see, thanks for reporting, will take a look at this.
Also, as I said above, we don't use an init_image in inpainting and we don't add any noise to the image and mask passed for inpainting; because of this, it's not possible to add strength here.
However, to keep consistency, what I could suggest is:
get the in-painted image from the pipeline and then replace the non-masked parts of the image with the original image. That could achieve what you are looking for.
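A minimal sketch of that compositing step, assuming PIL inputs and the usual mask convention (white = region to repaint); the file names and prompt are placeholders:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("original.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask.png").convert("L").resize((512, 512))

result = pipe(prompt="a red couch", image=init_image, mask_image=mask_image).images[0]

# Paste the repainted pixels back into the original: everything outside the
# mask stays pixel-identical to the input image.
composited = Image.composite(result, init_image, mask_image)
composited.save("inpainted.png")
```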
Awesome, @patil-suraj and @patrickvonplaten! Some questions so far on the implementation:
"For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself)"
- Should I add noise to all the 9 input channels? Or perhaps not add it to the mask?
- The output has only 4 channels. How do I compute the loss between the 4-channel noise_pred and the 9-channel input? Should I blend the two four-channel tensors using the mask in some way, or, for example, just take the first 4 channels?
@ulmewennberg I'm also working on it myself; judging from the inpainting pipeline, it seems to only add noise to the image latents.
I made a pull request here, but it's not passing the tests. Can someone guide me on how to fix it? It's my first contribution and I'm not sure how to do it right. @ulmewennberg @patil-suraj
Hey @ulmewennberg !
"Should I add noise to all the 9 input channels? Or perhaps not add it to the mask?"

The noise is added to the target image that we want the model to generate.

"The output has only 4 channels. How do I compute the loss between the 4-channel noise_pred and the 9-channel input?"
We add noise to the target encoded image which has 4 channels, and the loss is simply computed against that noise.
Hope this makes it clear.
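For concreteness, a rough sketch of how that training step could be assembled (assuming the usual objects from the DreamBooth training script, e.g. `vae`, `unet`, `noise_scheduler`, and precomputed `text_embeddings`; this is the general idea, not the exact code from the PR):

```python
import torch
import torch.nn.functional as F

# Assumed batch tensors:
#   pixel_values:        (B, 3, H, W)  target images
#   masked_pixel_values: (B, 3, H, W)  images with the masked region blanked out
#   masks:               (B, 1, H, W)  binary masks (1 = region to inpaint)

# Encode both images to latents (4 channels each).
latents = vae.encode(pixel_values).latent_dist.sample() * 0.18215
masked_latents = vae.encode(masked_pixel_values).latent_dist.sample() * 0.18215

# Downsample the mask to latent resolution (1 channel).
mask = F.interpolate(masks, size=latents.shape[-2:])

# Noise is added ONLY to the target latents, not to the mask or masked-image latents.
noise = torch.randn_like(latents)
timesteps = torch.randint(
    0, noise_scheduler.config.num_train_timesteps,
    (latents.shape[0],), device=latents.device,
).long()
noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

# 4 (noisy latents) + 1 (mask) + 4 (masked-image latents) = 9 input channels.
model_input = torch.cat([noisy_latents, mask, masked_latents], dim=1)

# The UNet predicts the 4-channel noise; the loss is computed against that noise only.
noise_pred = unet(model_input, timesteps, encoder_hidden_states=text_embeddings).sample
loss = F.mse_loss(noise_pred.float(), noise.float(), reduction="mean")
```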
Thanks for the PR @thedarkzeno ! will take a look.
Hey @thedarkzeno and @ulmewennberg, actually I'm not sure I understand this. What's the goal of doing DreamBooth with inpainting? How would it work?
@patil-suraj the goal is to be able to, for example, perform image inpainting using custom concepts.
I see, so you mean inpainting the custom concepts into the masked part?
@patil-suraj yep, this is correct
I'm also receiving the exact same error using the latest 0.7.2 diffusers library, StableDiffusionInpaintPipeline, and the latest runwayml/stable-diffusion-inpainting model. Is there anything I have to change on my end to fix this?
@Mystfit
The StableDiffusionInpaintPipeline should work with the runwayml/stable-diffusion-inpainting model. Could you please post a code snippet so we can take a look?
I think I've tracked down the cause of the error in my case. I'm using the StableDiffusionLongPromptWeightingPipeline (lpw_stable_diffusion) custom pipeline to add prompt weighting support and it doesn't seem to play well with the inpainting model or pipeline. When I remove the custom pipeline, then the error goes away.
@thedarkzeno I'm currently experimenting with your code a bit. Did you manage to get good results with it? Any chance you can share the flags you used?
@thedarkzeno what did you use for pretrained_model_name_or_path ("runwayml/stable-diffusion-inpainting"?)? I'm trying to launch the code, but it looks like it fails in with_prior_preservation mode.
Hi everyone, I haven't tested it extensively but got some interesting results, like this one:
It's still not perfect, but I'll see what I can do to improve.
@opetliak Yes, I used runwayml/stable-diffusion-inpainting. The prior preservation is indeed broken, but I'll try to fix it.
Very cool!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @patrickvonplaten,
I see that the legacy pipeline is being deprecated. I work regularly with older models, and I find it very useful. Would it be possible to NOT deprecate this pipeline?
If you absolutely have to, what are my options to keep using this pipeline?