latent-diffusion
Text + partial image prompting
Hi!
In Dall-E, we can provide a partial image in addition to the text description so that the model only completes the image. See:
Can we do the same with your models? That would be awesome. I tried to modify the LAION-400M model notebook but without much success.
Good question! I am also curious whether it is possible with the provided source code.
@hyungkwonko Yes, it can. I have already implemented and tested it on my own; it just needs some coding.
Just pass x0 and the mask into the DDIM sampler, and you can do inpainting while using text prompts.
@cachett-ML For your case, mask the image (e.g. keep only the top half), then do the inpainting with text prompts. You can do the same thing, but I have no idea whether it will work well. IMO, a better approach would be to condition on the visible areas during training, rather than using this text-conditioned txt2img model for the inpainting task, but this model is the best we can get our hands on so far : )
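To make the "pass x0 and the mask into the DDIM sampler" idea concrete, here is a minimal sketch of the masked-update step. This is not the actual latent-diffusion API: `q_sample` uses a toy linear noise schedule instead of the real alpha-cumprod schedule, and `denoise_fn` stands in for the model's DDIM denoising step. The core trick is the last line: denoise everywhere, then overwrite the known (mask == 1) region with x0 re-noised to the current step, so only the masked-out region is actually generated.

```python
import numpy as np

def q_sample(x0, t, T, noise):
    # Toy forward-diffusion schedule: linearly trade signal for noise as
    # t grows (a stand-in for the real alpha_cumprod schedule; illustrative only).
    alpha = 1.0 - t / T
    return np.sqrt(alpha) * x0 + np.sqrt(1.0 - alpha) * noise

def ddim_inpaint_step(x_t, x0, mask, t, T, denoise_fn, rng):
    """One masked DDIM-style step (hypothetical helper, not repo code):
    denoise the whole latent, then replace the known (mask == 1) region
    with x0 re-noised to level t, so the sampler only "invents" pixels
    where mask == 0."""
    x_t = denoise_fn(x_t, t)                    # model prediction for step t
    noise = rng.standard_normal(x0.shape)
    x_known = q_sample(x0, t, T, noise)         # known pixels at noise level t
    return mask * x_known + (1.0 - mask) * x_t  # keep known, generate the rest
```

At t = 0 the schedule returns x0 exactly, so after the final step the known region matches the input image and only the masked-out region holds generated content.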
python scripts/txt2img.py --prompt "a cat is wearing a green hat" --image_prompt test_inputs/images/images.jpeg --mask_prompt test_inputs/masks/images.png --ddim_eta 0.0 --n_samples 8 --n_iter 4 --scale 5.0 --ddim_steps 50
Hope it helps, cheers
For the complete code of txt2img.py, I made a pull request; see:
https://github.com/CompVis/latent-diffusion/pull/57
Hey @lxj616, that looks pretty awesome! Thanks for your kind reply & great work. Will try it and share my own results.