
Support for SDXL-Inpaint Model

Open wangqyqq opened this issue 1 year ago • 14 comments

Description

  • Support for SDXL-Inpaint Model #13195
  • Inpainting is a fundamental function in SD, and it is better to use a model dedicated to inpainting. WebUI currently supports the sd-1.5-inpaint model but not the sdxl-inpaint model, and the results of inpainting with the sdxl-base model are not good. This PR adds support for the sdxl-inpaint model.
  • The sdxl-inpaint model can be downloaded at wangqyqq/sd_xl_base_1.0_inpainting_0.1.safetensors. It was originally released by diffusers at diffusers/stable-diffusion-xl-1.0-inpainting-0.1 in diffusers format and converted to .safetensors by benjamin-paine; the former link is a copy of benjamin-paine's.

Screenshots/videos:

[image: inpaint-examples-min]

Checklist:

wangqyqq avatar Dec 21 '23 12:12 wangqyqq

You are a legend.

gel-crabs avatar Dec 21 '23 13:12 gel-crabs

when?

andzejsp avatar Dec 22 '23 11:12 andzejsp

when?

I've committed the code. You can check the commits here: https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14390/commits/9feb034e343d6d7ef63395821658fb3774b30a24.

wangqyqq avatar Dec 22 '23 12:12 wangqyqq

THANK YOU

Ckle avatar Dec 22 '23 21:12 Ckle

I've been trying this out and it seems to work. The inpainting model isn't great, but it's better than nothing, and maybe now people will train some better ones.

mweldon avatar Dec 23 '23 02:12 mweldon

I've been trying this out and it seems to work. The inpainting model isn't great, but it's better than nothing, and maybe now people will train some better ones.

If you're using a different SDXL model than SDXL 1.0, you can create a custom inpainting model using the guide from here (https://huggingface.co/benjamin-paine/sd-xl-alternative-bases). This works perfectly, especially with soft inpainting.

gel-crabs avatar Dec 23 '23 17:12 gel-crabs

@wangqyqq thank you so much for doing this! It's been months of people asking me to bring support to Automatic and me having to disappoint them saying I didn't know how and didn't have the time to learn, so thank you again.

I wanted to just volunteer that ENFUGUE has slightly changed how XL inpainting is implemented since the initial code, and my findings may be useful to port to Auto or just be aware of.

First I have to highlight the 1.0 issue; if you're used to using 1.0 strength for inpainting as I was before this, there is a poorly understood issue with the XL inpainting model whereby using strength=1.0 produces very poor results. I'm not sure if this issue will carry over to Automatic but it's a big problem in diffusers.


ENFUGUE inpainting setup.

Cropped inpainting is enabled, so a 1024x1024 square around the inpainted area will be diffused, whereas the rest will be ignored. This will really show off the issue.


Left: 1.0 denoising strength, right: 0.99 denoising strength.

You can probably spot many issues with the 1.0 version that 0.99 doesn't have, but the most obvious to me is the color grading. There is a very visible square around the bench in the 1.0 version that is a significantly different color. Because of this, ENFUGUE will simply reduce it to 0.99 automatically if the user passes strength=1.0 and is using XL inpainting.
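
A minimal sketch of that guard, with a hypothetical helper name (not ENFUGUE's actual code):

```python
def clamp_inpaint_strength(strength: float, is_xl_inpaint: bool) -> float:
    # Hypothetical helper mirroring the behaviour described above:
    # strength == 1.0 with the XL inpainting model produces the color-shifted
    # square shown above, so cap it just below 1.0.
    if is_xl_inpaint and strength >= 1.0:
        return 0.99
    return strength
```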

The second thing I wanted to call attention to is outpainting. These models can do pretty well at outpainting, but because of the issue above, the result will be poor if you initialize the empty space to black.


Left: ENFUGUE outpainting setup, right: the result when filling empty space with black.

To address this, ENFUGUE will fill the empty space with noise; I found the best results from Perlin noise clamped to the range (0.1, 0.9). There is probably a better noising algorithm that looks at the range of the rest of the image, but in any case the raw results from outpainting models are still not as good as their non-outpainting counterparts, so I've always been running an upscaling, refining or img2img pass after the outpainting step.


Left: the noised image, right: the result.
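
A rough sketch of that fill step, assuming the to-be-outpainted region is marked by zero alpha in an RGBA image; it uses clamped uniform noise as a simple stand-in for the Perlin noise described above:

```python
import numpy as np
from PIL import Image

def fill_empty_with_noise(image, seed=None):
    """Fill fully transparent (to-be-outpainted) pixels with noise in (0.1, 0.9).

    Sketch only: uniform noise instead of Perlin, and the empty region is
    assumed to be marked by alpha == 0 in an RGBA image.
    """
    rng = np.random.default_rng(seed)
    rgba = np.asarray(image.convert("RGBA")).astype(np.float32) / 255.0
    empty = rgba[..., 3] == 0.0                              # alpha marks empty space
    noise = rng.uniform(0.1, 0.9, size=rgba[..., :3].shape)  # clamped range from the comment
    rgb = rgba[..., :3].copy()
    rgb[empty] = noise[empty]
    return Image.fromarray((rgb * 255.0).astype(np.uint8), mode="RGB")
```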

Hope this is useful information, cheers for this!

painebenjamin avatar Dec 23 '23 20:12 painebenjamin

@painebenjamin Thanks a lot for your detailed reply and the information provided! I read it carefully and ran some tests. Here are my findings:

1. The 1.0 issue

Diffusers: In diffusers, the code path is different when strength is 1.0 versus when it is not (e.g. 0.99). Looking at prepare_latents in pipeline_stable_diffusion_xl_inpaint.py: when strength is 1.0, the parameter is_strength_max in prepare_latents is True, otherwise it is False.

When is_strength_max is True, the latents are pure noise; otherwise, the latents are a combination of image_latents and noise. So the pipeline behaves differently depending on whether strength is 1.0.
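
A condensed paraphrase of that branch (simplified, not the pipeline's actual signature):

```python
import torch

def prepare_inpaint_latents(scheduler, image_latents, noise, timestep, strength):
    # Simplified paraphrase of the logic described above, not the real
    # diffusers prepare_latents implementation.
    is_strength_max = strength == 1.0
    if is_strength_max:
        # strength == 1.0: start from pure noise scaled by the scheduler's initial sigma
        latents = noise * scheduler.init_noise_sigma
    else:
        # strength < 1.0: start from the image latents with noise added for `timestep`
        latents = scheduler.add_noise(image_latents, noise, timestep)
    return latents
```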

From code analysis and some tests, we can see that strength = 1.0 is prone to worse results. So I think automatically reducing it to 0.99 when the user passes strength=1.0 is exactly the right thing to do.

WebUI: In webui, the inference code is not the same as in diffusers. There is no code path equivalent to diffusers' strength=1.0 case; everything behaves like the strength!=1.0 path. Coincidentally, when the user passes denoising_strength=1.0, webui automatically reduces it to 0.999.

2. Outpainting

I observed the same behavior you mentioned. WebUI provides 4 methods to fill masked content, and the other 3 methods work; the method called "original", which refers back to the source image when outpainting, does not work in this situation. Users need to pay attention to this.

Thank you again for the information, cheers~

wangqyqq avatar Dec 25 '23 12:12 wangqyqq

@wangqyqq

SDXL works for me in the main branch.

Can you prepare the 4 files for the DEV branch? This is related to torch, I think; my idea is to have an ONNX model. For a normal SDXL model everything runs fine.

I tried with your 4 files, but there the inpaint model fails to load.

kalle07 avatar Dec 26 '23 20:12 kalle07

@kalle07 I tested on the dev branch and it works. Can you show the error log?

wangqyqq avatar Dec 27 '23 02:12 wangqyqq

OK, with the original model it works in DEV, but not with the one I merged...

In the master branch everything runs fine with my model...

The error in DEV more or less comes down to the fact that sd_xl_inpaint.yaml is not used when I choose my model; it is only used when I choose the original model:

...
Calculating sha256 for D:\stable-A1111-DEV\stable-diffusion-webui\models\Stable-diffusion\inpaint\SDXL_Inpaint_juggerV7_new.safetensors: 1a627292276b0a6df35f78f1eb5d984182d9e1fad29527afe23d13206219d5bc
Loading weights [1a62729227] from D:\stable-A1111-DEV\stable-diffusion-webui\models\Stable-diffusion\inpaint\SDXL_Inpaint_juggerV7_new.safetensors
Applying attention optimization: sdp-no-mem... done.
Weights loaded in 19.5s (send model to cpu: 2.1s, calculate hash: 13.9s, load weights from disk: 0.8s, apply weights to model: 0.8s, move model to device: 1.8s)
...

(but in master the yaml file is used and everything also works with my model)

I notice that the original processing.py in DEV is about 1000 bytes bigger than yours, maybe... but I have no idea ;)

kalle07 avatar Dec 27 '23 15:12 kalle07

@kalle07 The files are made from the master branch. If you are working on the dev branch, do not just replace the 4 files; you need to merge the changes into the corresponding files in the dev branch.
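
For context, webui picks the .yaml config by inspecting the checkpoint's state dict. A simplified sketch of the kind of check involved (the exact keys and config paths here are assumptions, not the PR's literal code):

```python
def guess_sdxl_config(state_dict):
    # Assumed key names based on the usual SDXL checkpoint layout.
    unet_in = state_dict.get("model.diffusion_model.input_blocks.0.0.weight")
    is_sdxl = "conditioner.embedders.1.model.ln_final.weight" in state_dict
    if is_sdxl and unet_in is not None and unet_in.shape[1] == 9:
        return "configs/sd_xl_inpaint.yaml"  # 9 UNet input channels => inpainting model
    if is_sdxl:
        return "sd_xl_base.yaml"             # placeholder for the regular SDXL config
    return None
```

If the detection change only exists in one branch, inpaint checkpoints fall back to the regular SDXL config, which would match the behaviour described above.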

wangqyqq avatar Dec 28 '23 03:12 wangqyqq

Works well even with merges. To install it, you need to download the patch of this PR: https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14390.diff

Then open a command line in your webui directory and run: git apply <path_to_the_diff_file>

light-and-ray avatar Dec 28 '23 13:12 light-and-ray

The merge worked, THX!!! Now the inpaint yaml works with my inpaint model (the one that worked fine on master), BUT in the DEV branch I get a NaN error:

...
Loading weights [1a62729227] from D:\stable-A1111-DEV\stable-diffusion-webui\models\Stable-diffusion\inpaint\SDXL_Inpaint_juggerV7_new.safetensors
Creating model from config: D:\stable-A1111-DEV\stable-diffusion-webui\configs\sd_xl_inpaint.yaml
Applying attention optimization: sdp-no-mem... done.
Model loaded in 5.0s (load weights from disk: 1.1s, create model: 0.5s, apply weights to model: 2.6s, load textual inversion embeddings: 0.3s, calculate empty prompt: 0.2s).
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Startup time: 16.6s (prepare environment: 3.0s, import torch: 3.8s, import gradio: 1.1s, setup paths: 0.7s, initialize shared: 0.2s, other imports: 0.5s, setup codeformer: 0.2s, load scripts: 1.5s, create ui: 5.1s, gradio launch: 0.3s).
0%| | 0/26 [00:00<?, ?it/s]
*** Error completing request
*** Arguments: ('task(ngzs8inradas1fe)', 0, 'girls', '', [], <PIL.Image.Image image mode=RGBA size=832x1408 at 0x1DB43454B20>, None, None, None, None, None, None, 34, 'Euler a', 4, 0, 1, 1, 1, 7, 1.5, 0.75, 0, 1408, 832, 1, 0, 0, 32, 0, '', '', '', [], False, [], '', <gradio.routes.Request object at 0x000001DB43454760>, 0, False, 1, 0.5, 4, 0, 0.5, 2, False, '', 0.8, -1, False, -1, 0, 0, 0, '* CFG Scale should be 2 or lower.', True, True, '', '', True, 50, True, 1, 0, False, 4, 0.5, 'Linear', 'None', 'Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, 'start', '', 'Will upscale the image by the selected scale factor; use width and height sliders to set tile size', 64, 0, 2, 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False) {}
Traceback (most recent call last):
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\call_queue.py", line 57, in f
    res = list(func(*args, **kwargs))
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\call_queue.py", line 36, in f
    res = func(*args, **kwargs)
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\img2img.py", line 238, in img2img
    processed = process_images(p)
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\processing.py", line 768, in process_images
    res = process_images_inner(p)
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\processing.py", line 902, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\processing.py", line 1589, in sample
    samples = self.sampler.sample_img2img(self, self.init_latent, x, conditioning, unconditional_conditioning, image_conditioning=self.image_conditioning)
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 188, in sample_img2img
    samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\sd_samplers_common.py", line 261, in launch_sampling
    return func()
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 188, in <lambda>
    samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "C:\ProgramData\anaconda3\envs\stable-diffusion-webui\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\stable-A1111-DEV\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "C:\ProgramData\anaconda3\envs\stable-diffusion-webui\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\stable-diffusion-webui\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\sd_samplers_cfg_denoiser.py", line 217, in forward
    devices.test_for_nans(x_out, "unet")
  File "D:\stable-A1111-DEV\stable-diffusion-webui\modules\devices.py", line 206, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

kalle07 avatar Dec 28 '23 15:12 kalle07

@kalle07 you can try the --no-half-vae or --no-half param.

wangqyqq avatar Dec 29 '23 03:12 wangqyqq

@wangqyqq Nope, always out of memory, and I have 16GB VRAM, but maybe DEV is different... My goal is to build a tensor/ONNX file from the inpaint model, to make it even faster... but that's the next step... Maybe it all already works in main? (I only got the tensor version running in DEV a month ago.)

kalle07 avatar Dec 29 '23 09:12 kalle07

Works well even with merges. To install it, you need to download the patch of this PR: https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14390.diff

Then open a command line in your webui directory and run: git apply <path_to_the_diff_file>

Folks, this is important. Before this instruction, I was not able to load the inpainting model into A1111. Just do what is instructed here, and it works!! Thank youuu

simartem avatar Dec 29 '23 13:12 simartem

There is a problem with merges and vae:

  1. SDXL Base Inpaint: okay with nothing special: 00118-2438248406
  2. I merged: A - sdxl base inpaint, B - juggernautXLv7, C - sdxl base, M=1, add difference, fp16. There are glitch-like artifacts: 00120-2438248406
  3. Using the SDXL VAE (or the baked-in VAE option in the merge): error: NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.
  4. Using the SDXL VAE + the --no-half-vae argument: okay! 00122-2438248406

So the question is: why does the base model not require --no-half-vae while producing artifact-free images? One more thing: if I select the SDXL VAE for the base model, I get the same error too. GPU: RTX 3060 12GB

light-and-ray avatar Dec 29 '23 18:12 light-and-ray

I found the solution: you need to download the fp16-fix SDXL VAE and use/bake it instead of the original: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors

light-and-ray avatar Dec 29 '23 19:12 light-and-ray

So there is a complete recipe:

A: sd_xl_base_1.0_inpainting_0.1.safetensors
B: your SDXL model
C: sd_xl_base_1.0_0.9vae.safetensors
M = 1
Interpolation Method = Add difference
Save as fp16 = True
Bake in VAE: sdxl-vae-fp16-fix

Works for realVisXL 3.0 too for me
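
For anyone curious what that recipe amounts to numerically, here is a rough sketch (assumptions: safetensors state dicts, VAE weights stored under the first_stage_model. prefix; the webui checkpoint merger handles more edge cases, so treat this as illustration only):

```python
import torch
from safetensors.torch import load_file, save_file

def add_difference_merge(path_a, path_b, path_c, vae_path, out_path, multiplier=1.0):
    # Rough equivalent of the recipe above: result = A + M * (B - C),
    # saved as fp16 with the fp16-fix VAE baked in.
    a, b, c = load_file(path_a), load_file(path_b), load_file(path_c)
    merged = {}
    for key, ta in a.items():
        tb, tc = b.get(key), c.get(key)
        if tb is None or tc is None:
            merged[key] = ta  # keys present only in the inpaint base pass through unchanged
        elif ta.shape == tb.shape:
            merged[key] = ta + multiplier * (tb - tc)
        else:
            # The inpaint UNet's first conv has 9 input channels vs. 4 in B/C:
            # merge only the shared 4 channels and keep A's extra mask channels.
            t = ta.clone()
            t[:, :4] = ta[:, :4] + multiplier * (tb - tc)
            merged[key] = t
    for key, tv in load_file(vae_path).items():
        merged["first_stage_model." + key] = tv  # assumed prefix for the baked-in VAE
    merged = {k: (v.half() if v.is_floating_point() else v) for k, v in merged.items()}  # Save as fp16
    save_file(merged, out_path)
```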

light-and-ray avatar Dec 29 '23 19:12 light-and-ray

@light-and-ray You can use the checkpoints I have prepared and avoid having to bake in another VAE; they are the ones mentioned by @wangqyqq in the initial issue and then referenced by @gel-crabs a little after.

https://huggingface.co/benjamin-paine/sd-xl-alternative-bases

Interpolation Method = Add difference
A is sd_xl_base_1.0_inpainting_0.1.safetensors
B is your fine-tuned checkpoint
C is sd_xl_base_1.0_fp16_vae.safetensors

painebenjamin avatar Dec 29 '23 19:12 painebenjamin

@light-and-ray

Hey, Jugger was also my first try, and I went through the same steps as you, with fp16 and the VAE... I thought that was normal ^^ So now maybe it works the way painebenjamin wrote...

kalle07 avatar Dec 30 '23 09:12 kalle07

@light-and-ray You can use the checkpoints I have prepared and avoid having to bake in another VAE; they are the ones mentioned by @wangqyqq in the initial issue and then referenced by @gel-crabs a little after.

https://huggingface.co/benjamin-paine/sd-xl-alternative-bases

Interpolation Method = Add difference
A is sd_xl_base_1.0_inpainting_0.1.safetensors
B is your fine-tuned checkpoint
C is sd_xl_base_1.0_fp16_vae.safetensors

I am gonna test this now; I wonder whether there will be any difference.

FurkanGozukara avatar Dec 30 '23 11:12 FurkanGozukara

I did a test and it is bad:

[image]

regular SDXL 1.0:

[image]

inpaint model with --no-half-vae:

[image]

inpaint model without --no-half-vae:

[image]

FurkanGozukara avatar Dec 30 '23 12:12 FurkanGozukara

@FurkanGozukara hey, I saw some nice videos of yours on LoRA training, THX. I'm happy that the world of "freaks" is so small ;)

I have uploaded a JuggerXL inpaint model on Civitai... it should work in Automatic1111.

Someone suggested FooocusControl, which I mentioned in my description there.

kalle07 avatar Dec 30 '23 14:12 kalle07

@FurkanGozukara hey, I saw some nice videos of yours on LoRA training, THX. I'm happy that the world of "freaks" is so small ;)

I have uploaded a JuggerXL inpaint model on Civitai... it should work in Automatic1111.

Someone suggested FooocusControl, which I mentioned in my description there.

thanks

FurkanGozukara avatar Dec 30 '23 14:12 FurkanGozukara

@wangqyqq hey, I found an SDXL pix2pix: https://huggingface.co/diffusers/sdxl-instructpix2pix-768

Previously, in SD 1.5, it was the instruct-pix2pix.yaml that provided the right setup; the SDXL file doesn't work... Can you help?

kalle07 avatar Jan 05 '24 20:01 kalle07

@wangqyqq hey, I found an SDXL pix2pix

It is not pix2pix; it's a random fine-tune or even a merge that the author called pix2pix just as clickbait. There are a lot of similar SD checkpoints with midjourney/dall-e/nai3 in their names.

light-and-ray avatar Jan 05 '24 20:01 light-and-ray

@wangqyqq (I can't reply to your message)?

I mean, is it not similar to SD 1.5? With the model I mentioned here, https://civitai.com/models/105817/img2img-pix2pix-realv30, you get a second slider for image CFG.

And again, these SDXL checkpoints don't work in Automatic1111; I think a new instruct-pix2pix.yaml must be generated. What do you think?

kalle07 avatar Jan 06 '24 08:01 kalle07

So there is a complete recipe:

A: sd_xl_base_1.0_inpainting_0.1.safetensors
B: your SDXL model
C: sd_xl_base_1.0_0.9vae.safetensors
M = 1
Interpolation Method = Add difference
Save as fp16 = True
Bake in VAE: sdxl-vae-fp16-fix

Works for realVisXL 3.0 too for me

Does this method apply to Turbo models as well?

cgrossi avatar Jan 13 '24 21:01 cgrossi