Inpainting models barely apply any change even at denoise 1
When using an inpainting model, there's almost no change to the masked region, even with a denoise value of 1. However, the same workflow with a regular model seems to work.
With inpainting model:
With regular model:
Inpaint models need the "VAE Encode (for Inpainting)" node, or else this is what happens.
But with "VAE Encode (for Inpainting)" it's not usable at low denoise values, as it adds a grey fill to the masked region.
That grey fill is what the inpaint model expects.
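For anyone wondering what that grey fill actually is, here's a minimal sketch of the pre-encode step, paraphrasing what "VAE Encode (for Inpainting)" appears to do (not the exact node source):

```python
import torch

def grey_fill(pixels: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Blend masked pixels toward neutral grey (0.5) before VAE encoding,
    so the inpaint model sees "nothing to keep" in the masked region.
    pixels: [B, H, W, C] in 0..1; mask: [B, H, W], 1 = region to inpaint."""
    keep = (1.0 - mask).unsqueeze(-1)          # 1 where the original is kept
    return pixels * keep + 0.5 * (1.0 - keep)  # grey inside the mask
```

Which is also why low denoise values fail with this node: the sampler starts from grey, not from the original content.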
This is how it looks with denoise 0.5:
Is there a way to replicate A1111's inpaint with an inpainting model and fill method "original"?

> Is there a way to replicate A1111's inpaint with an inpainting model and fill method "original"?
You have to use SetLatentNoise instead of VAEEncode (for inpaint)
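For context, SetLatentNoiseMask doesn't touch the latent pixels at all; it just attaches the mask so the sampler restores the original latent outside the masked region. A rough sketch (assuming ComfyUI's `noise_mask` convention):

```python
import torch

def set_latent_noise_mask(samples: dict, mask: torch.Tensor) -> dict:
    """Sketch of SetLatentNoiseMask: leave the encoded latent untouched and
    store the mask; the sampler then keeps denoised output only inside the
    mask and the original latent everywhere else."""
    out = samples.copy()
    out["noise_mask"] = mask.reshape((-1, 1, mask.shape[-2], mask.shape[-1]))
    return out
```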
> Is there a way to replicate A1111's inpaint with an inpainting model and fill method "original"?

> You have to use SetLatentNoise instead of VAEEncode (for inpaint)
@ltdrdata VAEEncode should be used with inpainting models according to @comfyanonymous: https://github.com/comfyanonymous/ComfyUI/issues/1186#issuecomment-1674105576
Also, I used SetLatentNoise in my first comment (check the screenshots) https://github.com/comfyanonymous/ComfyUI/issues/1186#issue-1846063953
If you want to use any of the original image, I'm fairly sure you need to use the SetLatentNoise method with a normal model rather than an inpaint model. A couple of tutorials I made that might help you out: https://youtu.be/g9JXx4ik_rc - basic masking stuff; https://youtu.be/Yc13jDr9VRk - advanced masking and compositing stuff
I've already tried the proposed workflows. However, the reason I opened this issue is that there is missing functionality when using inpainting models: using a denoise value lower than 1 while keeping the original pixel values of the input image. This is one of the main benefits of using an inpainting model.
Here's an example from A1111:
As you can see, the consistency is much better than using a denoise value of 1 with "VAE Encode (for Inpainting)" and an inpainting model.
Did you find a solution for this? That damn gray also affects the color of the inpainting if the area is large.

> Did you find a solution for this? That damn gray also affects the color of the inpainting if the area is large.
No, I was hoping I would get an answer here. I believe it's an issue/missing feature.
Did you try this? https://github.com/comfyanonymous/ComfyUI/issues/668

> did you try this #668

Yes, but it doesn't seem to have a noticeable effect with inpainting models.
I also had the same problem. Even though a month has passed, I'll leave my observations here; maybe they'll be useful to someone. The inpaint model really doesn't work the same way as in A1111. You have to use VAE Encode (for Inpainting) and draw the mask exactly along the edges of the object. In the first example (denoise strength 0.71), I selected only the lips, and the model repainted them green, almost keeping the slight smile of the original image. In the second example (denoise strength 0.8) I masked the entire mouth area, and the model drew green lips but lost the original concept. Set Latent Noise Mask does not work with the inpaint model.
Sorry for a little mess on the screen :)
Definitely agree that it's an issue and I hope it gets resolved. Right now, inpainting in ComfyUI is deeply inferior to A1111, which is a letdown.
I'll reiterate: using "Set Latent Noise Mask" allows you to lower the denoising value and benefit from information already in the image (e.g., something you sketched yourself), but when using inpainting models, even a denoising value of 1 will give you an image pretty much identical to the original.
When not using an inpainting model, you can use the "Set Latent Noise Mask" approach; however, A1111 with a lower denoise value and an inpainting model gives much better results. So there is a lot of value in letting us use an inpainting model with "Set Latent Noise Mask".
The only way to use an inpainting model in ComfyUI right now is "VAE Encode (for Inpainting)", but this only works correctly with a denoising value of 1. And that means we cannot use the underlying image (e.g., sketch stuff ourselves).
What I expect: "Set Latent Noise Mask" works with inpainting models, changing the image according to the denoising value, same as A1111. Or "VAE Encode (for Inpainting)" with a lower denoising value should use the underlying image instead of grey.
What happens: "Set Latent Noise Mask" with an inpainting model changes almost nothing at any denoising value (including 1), while "VAE Encode (for Inpainting)" only works correctly with a denoising value of 1, making it impossible to use the underlying image efficiently or at all.
Does anyone here understand both Comfy and A1111 codebases well enough to explain why Comfy's inpainting behaviour is so different?
I spent a couple hours digging into how the VAEEncodeForInpaint, SetLatentNoiseMask, and KSampler nodes work, hoping I could submit a patch PR. So far I can see that the denoising/scheduling/masking logic appear to be equivalent, but I can't isolate the offending difference yet. It's gonna take a lot more time to grok without help.
@comfyanonymous any chance you could point us in the right direction?
I've identified a key difference!
In Automatic1111 webui, time elapsed during img2img is directly proportional to denoising strength, i.e.:
- For denoising = 1.0, generation takes X seconds
- For denoising = 0.5, generation takes 0.5X seconds
- For denoising = 0.01, generation takes 0.01X seconds
In ComfyUI, time elapsed is constant regardless of the denoising strength.
Now I know what to look for in the codebase! Stay tuned
> - For denoising = 1.0, generation takes X seconds
> - For denoising = 0.5, generation takes 0.5X seconds
> - For denoising = 0.01, generation takes 0.01X seconds
A1111 uses fewer steps in proportion to the denoising value, which is why it is faster at lower denoise; there is a setting in A1111 that disables this behavior. To get the same behavior in ComfyUI, just use fewer steps.
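A rough sketch of the difference (step counts only; illustrative, not the actual code of either project):

```python
def a1111_img2img_steps(steps: int, denoise: float) -> int:
    # A1111 (by default) only runs the tail of the schedule,
    # so runtime scales with the denoising strength.
    return max(1, int(steps * denoise))

def comfyui_img2img_steps(steps: int, denoise: float) -> int:
    # ComfyUI builds a longer schedule (roughly steps / denoise) and
    # runs the last `steps` of it, so runtime stays constant.
    return steps

# e.g. steps=20, denoise=0.5 -> A1111 runs ~10 steps, ComfyUI runs 20
```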
> To get the same behavior in ComfyUI, just use fewer steps.

It won't work. At 1-2 steps the quality will be low, but at 3 steps you will already just see the original.
Bump, no update on this issue? From my understanding, inpainting models use a mask as an extra input; the one fed to the inpainting model is wrong for some reason when using the Masked Latent node, resulting in the inpainting model using the fill data as a reference to inpaint (and since the fill is actually decent, it sees no reason to change anything).
This is a huge limitation, especially when trying to use Lama as a preprocessor to get a rough fill before running the inpaint model.
I haven't fully tested it yet, but it seems I can get around this issue by using a normal VAE Encode on the original image along with the VAE Encode (for Inpainting) and doing a Latent Blend.
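For anyone trying to reproduce this: the blend itself is just a linear interpolation of the two latents, along the lines of this sketch (names are illustrative):

```python
import torch

def latent_blend(samples1: torch.Tensor, samples2: torch.Tensor,
                 blend_factor: float) -> torch.Tensor:
    """Sketch: mix the grey-filled inpaint encode (samples1) with a plain
    encode of the original image (samples2), approximating the starting
    point a partial denoise would have given."""
    return samples1 * blend_factor + samples2 * (1.0 - blend_factor)
```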
@JoeNavark And with a 50% blend, do you get roughly what you would have expected from 0.5 denoise (more or less)?
This does somewhat work; you need to add an extra latent mask or it just inpaints everything. Still some feathering/edge issues to solve, but it appears to be doing what we want.
That's weird, I didn't have to add the second mask as long as the masked encode went to samples1.
I may have switched samples1 and samples2 before taking the screenshot to clean it up, so it might be because of that.
I solved this problem by modifying some code. The inpainting model requires receiving "noisy images" and "masked images" as inputs. WEBUI and ComfyUI use different processing methods, as shown in the following figure.
To solve this problem, the code in four places across three files needs to be modified:
- nodes.VAEEncodeForInpaint.encode() needs to output "latent_image" and "masked_latent_image"
- nodes.common_ksampler() needs to add "masked_latent_image" to the positive and negative conditions
- comfy.sampler.sample() needs to apply model.process_latent_in() to "masked_latent_image"
- comfy.model_base.BaseModel.extra_conds() needs to use "masked_latent_image" instead of "latent_image" as an element of "cond_concat"
If the project maintainer agrees with this plan, I would be happy to submit a PR to fix this issue.
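To make the intended input format concrete: SD inpainting checkpoints expect a 9-channel UNet input, the 4-channel noisy latent concatenated with a 1-channel mask and the 4-channel latent of the masked original image. A sketch of the concat this proposal would build (illustrative, not the actual patch):

```python
import torch

def build_cond_concat(mask: torch.Tensor,
                      masked_latent_image: torch.Tensor) -> torch.Tensor:
    """mask: [B, 1, h, w], downscaled to latent resolution (1 = inpaint here).
    masked_latent_image: [B, 4, h, w], VAE encode of the original image with
    the masked region blanked out (what WEBUI feeds the model).
    The UNet input is then cat([noisy_latent, cond_concat], dim=1) -> 9ch."""
    return torch.cat([mask, masked_latent_image], dim=1)  # [B, 5, h, w]
```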
> I solved this problem by modifying some code. The inpainting model requires receiving "noisy images" and "masked images" as inputs. WEBUI and ComfyUI use different processing methods, as shown in the following figure.
> To solve this problem, the code in four places across three files needs to be modified:
> - nodes.VAEEncodeForInpaint.encode() needs to output "latent_image" and "masked_latent_image"
> - nodes.common_ksampler() needs to add "masked_latent_image" to the positive and negative conditions
> - comfy.sampler.sample() needs to apply model.process_latent_in() to "masked_latent_image"
> - comfy.model_base.BaseModel.extra_conds() needs to use "masked_latent_image" instead of "latent_image" as an element of "cond_concat"
> If the project maintainer agrees with this plan, I would be happy to submit a PR to fix this issue.
Finally, an analysis has been conducted to identify the implementation differences. Good job!
Thanks a lot, supporting the idea of the PR. Indeed, the workaround is still not as good as a true partial denoise, because the latent blend appears to mess with the noise offset, resulting in some tint mismatch.
I submitted PR #2501.