code for user-defined mask

Open fabrizioguillaro opened this issue 1 year ago • 3 comments

Hello! I am trying to use your code for "Null-text Inversion for Editing Real Images using Guided Diffusion Models". In particular, since I have an inpainting mask, I am trying to generate an image using a user-defined mask (as shown in fig. 8 or fig. 14 of "Prompt-To-Prompt Image Editing With Cross-Attention Control"). The code for using a user-defined mask is missing, so I tried to implement it myself. Did you just apply the given mask in LocalBlend instead of the one computed from the prompt? Could the following code represent what you did (resizing the mask to 64x64, repeating it over the 2 batch entries, and applying the mask in the latent space)?

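class LocalBlend:
    ...
    def __init__(...):
        ...
        # Resize the user-defined mask to the 64x64 latent resolution
        mask = np.array(Image.fromarray(mask).resize((64, 64), Image.NEAREST))
        mask = mask[None, None, :, :]   # shape (1, 1, 64, 64)
        mask = mask.repeat(2, axis=0)   # shape (2, 1, 64, 64): one copy per batch entry
        self.mask = torch.from_numpy(mask).cuda()

    def __call__(...):
        ...
        mask = self.mask
        mask = mask.float()
        # Keep the first latent outside the mask, the per-prompt latent inside it
        x_t = x_t[:1] + mask * (x_t - x_t[:1])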
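
For anyone who wants to check the blending step in isolation, here is a minimal, self-contained sketch of the same idea. It is not the authors' implementation: it uses NumPy instead of torch so it runs without a GPU, and the helper names `prepare_mask` and `blend` are hypothetical. It also assumes that any nonzero mask pixel means "editable", so a 0/255 mask is binarized to 0/1 before blending; the snippet above does not do this, and with a 0/255 mask the difference term would be scaled by 255.

```python
import numpy as np


def prepare_mask(mask, size=64, batch=2):
    """Binarize a 2-D mask, downsample it with nearest-neighbor sampling,
    and shape it as (batch, 1, size, size) so it broadcasts over channels."""
    mask = np.asarray(mask)
    # Assumption: any nonzero pixel marks the editable region.
    mask = (mask > 0).astype(np.float32)
    h, w = mask.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    small = mask[rows[:, None], cols[None, :]]
    return np.repeat(small[None, None], batch, axis=0)


def blend(x_t, mask):
    """Keep the first latent x_t[:1] outside the mask and each batch
    entry's own latent inside the mask."""
    return x_t[:1] + mask * (x_t - x_t[:1])


# Toy latents: batch of 2 (source, edited), 4 channels, 64x64.
rng = np.random.default_rng(0)
x_t = rng.standard_normal((2, 4, 64, 64)).astype(np.float32)

# Toy 512x512 mask whose left half is editable (255) and right half is not (0).
user_mask = np.zeros((512, 512), dtype=np.uint8)
user_mask[:, :256] = 255

m = prepare_mask(user_mask)
out = blend(x_t, m)
```

In this toy setup, `out[0]` equals the first latent everywhere, and `out[1]` matches the first latent on the right half and the second latent on the left half, which is the behavior described in the question.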

fabrizioguillaro avatar Oct 23 '23 13:10 fabrizioguillaro

The code I wrote works (example in the image); I am just wondering whether it matches the way you intended to do it.

As you can see, using the given mask, the code above allows me to edit just the pie on the left, instead of all the pies: image

fabrizioguillaro avatar Oct 23 '23 13:10 fabrizioguillaro

Thanks for bringing this up. I also have a similar question about replacing the estimated mask with user-provided masks. Could you share the code to reproduce the results shown in the above example? I noticed that the rolling pin on the right was distorted, even with the presence of the mask.

Yutong-Dai avatar Nov 27 '23 17:11 Yutong-Dai

@fabrizioguillaro What if the mask didn't match the position of the pie, e.g. if the mask were on the right? Would it still give reasonable results?

AhmedBourouis avatar Mar 15 '24 12:03 AhmedBourouis