diffusers How to - Image2Image with (Image + Text) Guided!

I have an image containing a target object, and I'd like to generate new images that place that object naturally in a text-guided background. In short:

Start with an image of the target object (e.g. a car).
Remove the complex background with Model A.
Pass the result to Model B together with a text prompt.
Generate new images showing the target object in different poses, with a new background defined by the prompt.

I'm looking for Model B here. Can this be achieved with diffusers? Any pointers?

Nov 11 '22 12:11 innat

Interesting! Maybe this could work: https://github.com/huggingface/diffusers/issues/1305 -> we should have it somewhat soon :-)

Nov 16 '22 21:11 patrickvonplaten

Wow, you're right. The pose of the target object is fixed, but this is pretty close.

https://arxiv.org/pdf/2211.07825.pdf

Nov 17 '22 00:11 innat

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Dec 11 '22 15:12 github-actions[bot]

Still open to any suggestions and recommendations. (No github-actions bot please, 🔕)

Dec 12 '22 19:12 innat

BTW, the depth-conditioned Stable Diffusion model (depth2img) should work quite well for this: https://huggingface.co/stabilityai/stable-diffusion-2-depth

Dec 16 '22 14:12 patrickvonplaten
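The depth-conditioned checkpoint suggested above can be tried roughly as follows. This is a minimal sketch, not a tested recipe from this thread: it assumes `diffusers`, `torch`, and `Pillow` are installed, and the function names, file paths, and the default `strength` of 0.7 are illustrative choices. When no depth map is passed, the pipeline estimates one from the input image, so the overall layout (and therefore the object's pose) is largely preserved while the prompt restyles the scene.

```python
from typing import Any, Dict

# Checkpoint linked in the comment above.
MODEL_ID = "stabilityai/stable-diffusion-2-depth"


def depth2img_kwargs(prompt: str, strength: float = 0.7,
                     negative_prompt: str = "") -> Dict[str, Any]:
    """Collect call arguments for the pipeline, validating `strength`.

    `strength` controls how far the result may drift from the input image:
    values near 0 stay close to it, values near 1 mostly ignore it.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    kwargs: Dict[str, Any] = {"prompt": prompt, "strength": strength}
    if negative_prompt:
        kwargs["negative_prompt"] = negative_prompt
    return kwargs


def restyle_with_depth(image_path: str, prompt: str, out_path: str,
                       **overrides: Any) -> None:
    """Hypothetical helper: load an image, run depth2img, save the result."""
    # Heavy imports stay local so depth2img_kwargs works without a GPU stack.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionDepth2ImgPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
        MODEL_ID, torch_dtype=dtype
    ).to(device)

    init_image = Image.open(image_path).convert("RGB")
    result = pipe(image=init_image, **depth2img_kwargs(prompt, **overrides))
    result.images[0].save(out_path)
```

A call might look like `restyle_with_depth("car.png", "a car parked on a beach at sunset", "out.png", strength=0.8)`, where both file names are placeholders.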

@patrickvonplaten https://twitter.com/matthieurouif/status/1589539286814461953

Dec 21 '22 01:12 innat

This would make a nice pipeline here as well :-)

Jan 03 '23 11:01 patrickvonplaten

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Jan 27 '23 15:01 github-actions[bot]

@patrickvonplaten Could you please open it?

Feb 05 '23 15:02 innat

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Mar 04 '23 15:03 github-actions[bot]

Candidates: https://huggingface.co/docs/diffusers/using-diffusers/controlling_generation

Mar 09 '23 04:03 innat

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Apr 02 '23 15:04 github-actions[bot]

This can probably be achieved with recent approaches such as ControlNet.

Apr 03 '23 12:04 innat
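As a rough illustration of the ControlNet idea mentioned above, the sketch below conditions generation on a precomputed control image (for example an edge map of the target object) plus a text prompt. Everything specific here is an assumption rather than something stated in this thread: the two model IDs are commonly used public checkpoints whose availability may change, the helper names are hypothetical, and the caller is expected to supply a control image that matches the chosen ControlNet.

```python
from typing import Any

# Assumed checkpoints (not from this thread); swap in whatever you actually use.
CONTROLNET_ID = "lllyasviel/sd-controlnet-canny"
BASE_MODEL_ID = "stable-diffusion-v1-5/stable-diffusion-v1-5"


def snap_to_multiple(value: int, multiple: int = 8) -> int:
    """Round `value` down to a multiple of `multiple`, but never below it.

    Stable Diffusion pipelines expect image sides divisible by 8.
    """
    return max(multiple, (value // multiple) * multiple)


def generate_with_controlnet(control_image_path: str, prompt: str,
                             out_path: str, steps: int = 30) -> None:
    """Hypothetical helper: generate an image guided by a control image."""
    # Heavy imports stay local so snap_to_multiple works on its own.
    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    controlnet = ControlNetModel.from_pretrained(CONTROLNET_ID, torch_dtype=dtype)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        BASE_MODEL_ID, controlnet=controlnet, torch_dtype=dtype
    ).to(device)

    # The control image (e.g. a Canny edge map) is prepared by the caller.
    control = Image.open(control_image_path).convert("RGB")
    control = control.resize(
        (snap_to_multiple(control.width), snap_to_multiple(control.height))
    )

    result: Any = pipe(prompt, image=control, num_inference_steps=steps)
    result.images[0].save(out_path)
```

Because the control image fixes the object's outline, this approach still constrains the pose; varying the pose would require a different control image per output.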

CC @patrickvonplaten

Subject-driven Text-to-Image Generation via Apprenticeship Learning https://arxiv.org/abs/2304.00186#

Apr 04 '23 17:04 innat

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Apr 29 '23 15:04 github-actions[bot]
