
optimized inpaint

Open thezveroboy opened this issue 3 years ago • 6 comments

Is it possible to make an optimized inpainting script based on SD 1.4?

thezveroboy avatar Aug 25 '22 09:08 thezveroboy

Hi, I'm currently working on exactly that, but the inpainting process isn't very clear to me. Do you have any references I can follow?

basujindal avatar Aug 25 '22 09:08 basujindal

The Glid-3-xl repo has such an implementation, built on the older latent-diffusion model: https://github.com/Jack000/glid-3-xl

The only issue is that it uses a model trained specifically for inpainting. It may still offer some ideas nonetheless.

GucciFlipFlops1917 avatar Aug 25 '22 17:08 GucciFlipFlops1917

There is an implementation of inpainting in this Colab. It uses the Hugging Face diffusion pipeline.

CasvandenBogaard avatar Aug 25 '22 18:08 CasvandenBogaard

Inpainting in SD's own code is (for right now) very basic. You have to download a 3.1 GB model file specifically for it and set up a directory containing the image(s) you want to run it on, with a _mask image for each one (all black except in the erased parts you want to replace; data/inpainting-examples has examples). Then run scripts/inpaint.py --indir "input directory" --outdir "output directory" --steps (defaults to 50). There's no way to add a prompt or any other options, and you'll probably have to run it multiple times to get decent results.
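For anyone setting up that input directory, here is a rough sketch of a pre-flight check that pairs each mask with its image before running the script. It assumes masks are PNGs named "<name>_mask.png" sitting next to "<name>.png"; that naming and the helper name find_inpaint_pairs are assumptions for illustration, not something taken from scripts/inpaint.py itself.

```python
from pathlib import Path

MASK_SUFFIX = "_mask.png"  # assumed naming convention, not confirmed by the script


def find_inpaint_pairs(indir):
    """Return (pairs, missing) for an inpainting input directory.

    pairs   -- list of (image_path, mask_path) where both files exist
    missing -- list of mask paths that have no matching image
    """
    pairs = []
    missing = []
    for mask in sorted(Path(indir).glob("*" + MASK_SUFFIX)):
        # "photo_mask.png" -> "photo.png"
        image = mask.with_name(mask.name[: -len(MASK_SUFFIX)] + ".png")
        if image.exists():
            pairs.append((image, mask))
        else:
            missing.append(mask)
    return pairs, missing
```

Calling find_inpaint_pairs("data/inpainting-examples") should list the example pairs if the assumed naming holds; any entry in the second list points to a mask without an image.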

There's a better version coming that'll let you prompt and will probably be a lot smarter. Right now it's basically just smart filling and doesn't always give good results.

TheEnhas avatar Aug 26 '22 16:08 TheEnhas

Can we use the same weights for inpainting in theory (with some changes in the code), or do we need to train another model?

basujindal avatar Aug 26 '22 17:08 basujindal

No idea, I'm not a coder in any way lol. The SD weights might work for it with some changes, but you can't just drop the SD weight file in there. It's completely different.

Actual inpainting (with prompts) that I'm seeing elsewhere seems to be a variation of img2img or something and not really like this at all. An example is this video: https://twitter.com/wbuchw/status/1563162131024920576
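For reference, one common way to turn an img2img-style result into an inpaint is to composite it back onto the original through the mask, so only the masked region changes. The sketch below shows just that blending step on plain nested lists of pixel values; whether the tool in the video works this way is an assumption, and the function name composite and the mask convention (1.0 = replace, 0.0 = keep) are chosen here for illustration.

```python
def composite(original, generated, mask):
    """Blend two same-sized grayscale images given as nested lists of floats.

    mask values are assumed to lie in [0, 1]:
    1.0 takes the generated pixel, 0.0 keeps the original pixel,
    and values in between mix the two (a soft mask edge).
    """
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]
```

A real pipeline would apply the same idea to image tensors or latents rather than Python lists, but the arithmetic per pixel is the same.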

TheEnhas avatar Aug 26 '22 17:08 TheEnhas