diffusers

How to: Image2Image guided by (Image + Text)

Open innat opened this issue 3 years ago • 5 comments

I have an image containing a target object, and I'd like to generate new images that show that object in a natural-looking scene whose background is described by a text prompt. In short:

  • Start with an image of the target object (e.g. a car).
  • Remove the complex background with Model A.
  • Pass the result to Model B together with a text prompt.
  • Generate new images showing the target object in different poses against the new background defined by the prompt.

I'm looking for Model B. Can this be achieved with diffusers? Any pointers?
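
One possible reading of the steps above, sketched with diffusers: treat Model B as an inpainting pipeline and let it repaint only the pixels that Model A made transparent. This is an assumption, not something confirmed in this thread. The checkpoint id, the 512x512 size, the alpha threshold, and the helper names `background_mask_value` and `repaint_background` are illustrative, and a CUDA GPU is assumed. Inpainting keeps the object's pose fixed, so this covers the new background but not the new poses.

```python
from typing import Any


def background_mask_value(alpha: int, threshold: int = 128) -> int:
    """Map one alpha value (0-255) to an inpainting mask value.

    Inpainting pipelines repaint white (255) pixels and keep black (0) ones,
    so transparent background becomes white and the opaque object black.
    """
    return 255 if alpha < threshold else 0


def repaint_background(
    cutout_rgba: Any,
    prompt: str,
    model_id: str = "stabilityai/stable-diffusion-2-inpainting",  # assumed checkpoint
) -> Any:
    """Repaint the transparent part of a PIL RGBA cutout from a text prompt."""
    # Heavy imports are deferred so the helper above works without them.
    import torch
    from diffusers import StableDiffusionInpaintPipeline

    cutout = cutout_rgba.convert("RGBA").resize((512, 512))
    # Build the mask from the alpha channel produced by the background remover.
    mask = cutout.getchannel("A").point(background_mask_value)

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
    result = pipe(prompt=prompt, image=cutout.convert("RGB"), mask_image=mask)
    return result.images[0]
```

Calling `repaint_background(cutout, "a car parked on a beach at sunset")` with the RGBA output of Model A would return a PIL image in which only the former background has been regenerated.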

innat avatar Nov 11 '22 12:11 innat

Interesting! Maybe this could work: https://github.com/huggingface/diffusers/issues/1305 -> we should have it somewhat soon :-)

patrickvonplaten avatar Nov 16 '22 21:11 patrickvonplaten

Wow, you're right. The pose of the target object is fixed, but this is pretty close.

https://arxiv.org/pdf/2211.07825.pdf

innat avatar Nov 17 '22 00:11 innat

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Dec 11 '22 15:12 github-actions[bot]

Still open to any suggestions and recommendations. (No github-actions bot, please 🔕)

innat avatar Dec 12 '22 19:12 innat

BTW, the depth-conditioned Stable Diffusion 2 checkpoint should work quite well for this: https://huggingface.co/stabilityai/stable-diffusion-2-depth
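
A minimal sketch of how that checkpoint might be used, assuming the `StableDiffusionDepth2ImgPipeline` class from diffusers and a CUDA GPU. The helper names `depth2img_call_args` and `restyle_with_depth` and the default `strength` of 0.7 are illustrative choices, not values from this thread.

```python
from typing import Any, Dict, Optional


def depth2img_call_args(
    prompt: str,
    strength: float = 0.7,
    negative_prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect and validate the keyword arguments for the pipeline call.

    `strength` controls how far the result may drift from the input image:
    0.0 keeps it unchanged, 1.0 ignores its appearance almost entirely.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    args: Dict[str, Any] = {"prompt": prompt, "strength": strength}
    if negative_prompt:
        args["negative_prompt"] = negative_prompt
    return args


def restyle_with_depth(init_image: Any, prompt: str, **options: Any) -> Any:
    """Regenerate a PIL image from a prompt while preserving its depth layout."""
    # Heavy imports are deferred so the helper above works without them.
    import torch
    from diffusers import StableDiffusionDepth2ImgPipeline

    pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-depth", torch_dtype=torch.float16
    ).to("cuda")
    # The pipeline estimates a depth map from init_image and conditions on it.
    call_args = depth2img_call_args(prompt, **options)
    return pipe(image=init_image, **call_args).images[0]
```

Because the depth map is estimated from the input image, the object's shape and pose are preserved while its surroundings follow the prompt.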

patrickvonplaten avatar Dec 16 '22 14:12 patrickvonplaten

@patrickvonplaten https://twitter.com/matthieurouif/status/1589539286814461953

innat avatar Dec 21 '22 01:12 innat

This would make a nice pipeline here as well :-)

patrickvonplaten avatar Jan 03 '23 11:01 patrickvonplaten

@patrickvonplaten Could you please open it?

innat avatar Feb 05 '23 15:02 innat

Candidates: https://huggingface.co/docs/diffusers/using-diffusers/controlling_generation

innat avatar Mar 09 '23 04:03 innat

This can probably be achieved with recent approaches such as ControlNet.
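
A rough sketch of what the ControlNet route could look like, assuming `StableDiffusionControlNetPipeline` and `ControlNetModel` from diffusers and a CUDA GPU. The checkpoint ids, the helper names `controlnet_checkpoint` and `generate_with_controlnet`, and the use of a precomputed control image (for example an edge or depth map of the cutout) are assumptions for illustration.

```python
from typing import Any

# Assumed ControlNet checkpoints for Stable Diffusion 1.5, keyed by condition type.
CONTROLNET_CHECKPOINTS = {
    "canny": "lllyasviel/sd-controlnet-canny",
    "depth": "lllyasviel/sd-controlnet-depth",
}


def controlnet_checkpoint(condition: str) -> str:
    """Return the checkpoint id for a conditioning type, or raise ValueError."""
    try:
        return CONTROLNET_CHECKPOINTS[condition]
    except KeyError:
        known = ", ".join(sorted(CONTROLNET_CHECKPOINTS))
        raise ValueError(f"unknown condition {condition!r}; expected one of: {known}")


def generate_with_controlnet(
    control_image: Any,
    prompt: str,
    condition: str = "canny",
    base_model: str = "stable-diffusion-v1-5/stable-diffusion-v1-5",  # assumed base
) -> Any:
    """Generate an image whose structure follows `control_image` (a PIL image)."""
    # Heavy imports are deferred so the lookup helper works without them.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(
        controlnet_checkpoint(condition), torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model, controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")
    # Here `image` is the conditioning map, not an image to be edited.
    return pipe(prompt, image=control_image).images[0]
```

The control image fixes the outline of the object, so the prompt mainly decides texture and background; the object itself is re-rendered rather than copied from the original photo.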

innat avatar Apr 03 '23 12:04 innat

CC @patrickvonplaten

Subject-driven Text-to-Image Generation via Apprenticeship Learning https://arxiv.org/abs/2304.00186#

innat avatar Apr 04 '23 17:04 innat
