What is inpainting_mask in the use of Zero-shot Inpainting?

Open KeyaoZhao opened this issue 2 years ago • 6 comments

What is inpainting_mask when using Zero-shot Inpainting? Should we mask the raw_pil_image first, and will the model then inpaint the masked part? Thanks a lot!

KeyaoZhao avatar May 18 '23 10:05 KeyaoZhao

I think I have the same question. I would like to add objects to a preexisting photo scene. It seems onerous to have to define a mask; I'd want the added objects simply placed "organically" in correct/plausible locations.

phalexo avatar May 19 '23 23:05 phalexo

I have tried setting 'support_pil_img' to ori_img (the original image, without a mask applied) and 'inpainting_mask' to a one-channel mask image. But the result of if_II_kwargs is exactly the same as 'support_pil_img'; the prompt has no influence on the output. How should I fix this?

KeyaoZhao avatar May 22 '23 07:05 KeyaoZhao

Hello, I have the same problem. I have tried mask shapes of [h,w], [h,w,3], and [1,h,w,3], but all of them failed. Did you figure out what data type and shape the mask should be?

AnranXu avatar Jun 11 '23 05:06 AnranXu

I still have no idea how to get the same effect as the inpainting example. But if you want to add text to the image, you can try TextDiffuser.

KeyaoZhao avatar Jun 12 '23 03:06 KeyaoZhao

Thanks. If I figure out how to make it work, I will share it here.

AnranXu avatar Jun 12 '23 03:06 AnranXu

I managed to make it work after a deep look at the code. What you should provide is a mask as a torch.FloatTensor of shape [1, 3, h, w]. Set the mask values to 1 where you want the model to modify the image, and 0 where the model should leave the pixels untouched.
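
For illustration, a mask with that shape and those values could be built roughly like this. This is only a sketch: `make_inpainting_mask`, the rectangular `box` region, and the 64x96 size are hypothetical and not part of the library.

```python
import torch


def make_inpainting_mask(h, w, box):
    """Return a float32 mask of shape [1, 3, h, w].

    Hypothetical helper: `box` is (top, left, bottom, right) in pixels.
    Pixels inside the box are 1 (to be modified), all others are 0 (kept).
    """
    top, left, bottom, right = box
    mask = torch.zeros((h, w), dtype=torch.float32)
    mask[top:bottom, left:right] = 1.0
    # Add batch and channel dimensions, then copy the mask to 3 channels.
    return mask[None, None].repeat(1, 3, 1, 1)


# Example: mark a 32x48 rectangle inside a 64x96 image for inpainting.
inpainting_mask = make_inpainting_mask(64, 96, (16, 24, 48, 72))
```

The resulting tensor would then be passed as `inpainting_mask`, with h and w matching the image being inpainted.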

Now, in order for this solution to work properly, you'll need to apply the patch available in pull request #64 .

Furthermore, if your image has an aspect ratio that is not well-rounded, the shape of the generated image in the first stage may differ from the shape of the mask and support noise. To address this issue, I have proposed a fix in pull request #125 .
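
To see what such a shape mismatch involves, here is a small sketch of resizing a [1, 3, h, w] mask to a different spatial size. It is not the change from pull request #125; `resize_mask` and the sizes used are hypothetical, and nearest-neighbour interpolation is just one way to keep the values at exactly 0 or 1.

```python
import torch
import torch.nn.functional as F


def resize_mask(mask, size):
    """Resize a [1, 3, h, w] float mask to `size` = (new_h, new_w).

    Hypothetical helper. Nearest-neighbour interpolation copies existing
    values, so the resized mask still contains only 0s and 1s.
    """
    return F.interpolate(mask, size=size, mode="nearest")


# Example: a mask whose left half is 1, resized from 64x96 to 64x88.
mask = torch.zeros((1, 3, 64, 96), dtype=torch.float32)
mask[:, :, :, :48] = 1.0
resized = resize_mask(mask, (64, 88))
```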

pierrot-lc avatar Jun 19 '23 15:06 pierrot-lc