stable-diffusion-webui
[Feature Request]: Add ability to merge images ad hoc
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do ?
This feature would let you merge two images by interrogating them with BLIP/CLIP and then blending their image latents and conditioning.
Proposed workflow
Just follow the demo code below. I prepopulated the prompts, but you can use BLIP/CLIP to generate them.
https://gist.github.com/AmericanPresidentJimmyCarter/b4b69daa577936cb72aec4db44d0a2ea
Additional information
No response
Q: Which method of combining 2 input images is better? https://github.com/DiceOwl/StableDiffusionStuff/blob/main/interpolate.py
readme: https://github.com/DiceOwl/StableDiffusionStuff#interpolate
I hope eventually we can make an interpolation video between 2 inputs.
I just mix the values randomly, lerp them, or slerp them, and I apply the same operation to both the image latents and the conditionings. Doing a video should be pretty trivial: just iterate over np.linspace for however many frames you want.
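For anyone who wants to try the blending described above, here is a minimal NumPy sketch. It is not taken from the linked gists: the function names (`lerp`, `slerp`, `random_mix`, `interpolate_frames`) are made up for illustration, the arrays stand in for whatever latent and conditioning tensors your pipeline produces, and reading "mix values randomly" as a per-element random pick is an assumption.

```python
import numpy as np


def lerp(a, b, t):
    """Linear interpolation: t=0 gives a, t=1 gives b."""
    return (1.0 - t) * a + t * b


def slerp(a, b, t, eps=1e-7):
    """Spherical interpolation, treating each array as one flat vector."""
    a_flat, b_flat = a.ravel(), b.ravel()
    cos_omega = np.dot(a_flat, b_flat) / (
        np.linalg.norm(a_flat) * np.linalg.norm(b_flat)
    )
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Vectors are (anti)parallel, so fall back to plain lerp.
        return lerp(a, b, t)
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / sin_omega


def random_mix(a, b, t, rng):
    """Assumed reading of 'mix randomly': take each element from b with probability t."""
    return np.where(rng.random(a.shape) < t, b, a)


def interpolate_frames(latent_a, latent_b, cond_a, cond_b, n_frames, blend=slerp):
    """Yield one (latent, conditioning) pair per frame, blended with the same t."""
    for t in np.linspace(0.0, 1.0, n_frames):
        yield blend(latent_a, latent_b, t), blend(cond_a, cond_b, t)
```

Each yielded pair would then be passed to the sampler to render one video frame. That part depends on the pipeline and is left out here.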
Thanks. I would add this to the wiki, but it's not very user-friendly at the moment.
Here are some scripts for video. This one just uses the BLIPed/CLIP ranked prompts.
Script: https://gist.github.com/AmericanPresidentJimmyCarter/159b6fc3a538ae0221a58967dfc2b705
Example:
https://user-images.githubusercontent.com/110263573/200696942-c21d8a28-33cd-4785-acf8-0c2da0432789.mp4
Here is another script where we define what the expected prompt is for the midstate. Here I morph a photo of a pigeon into a photo of a man that someone sent me, with a description of the midstate.
Script: https://gist.github.com/AmericanPresidentJimmyCarter/790c9ae23ff0831a74d9a48977ee712d
https://user-images.githubusercontent.com/110263573/200697290-5490e0c7-5f30-4dd0-a84c-231bc51b1301.mp4
In both cases I found that the most significant differences occur around the midstate, so I weight the transition toward it with `for itr, i in enumerate(np.linspace(0., 1., STEPS_IN_OUT)**(1/2))`.
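To see what the square-root weighting of np.linspace does, here is a small sketch. The helper name `eased_schedule` and the step count of 5 are made up for illustration. The square root pushes every value toward 1, so the spacing between consecutive values shrinks as they approach 1. Assuming 1 corresponds to the midstate in the scripts above, more frames land near the midstate.

```python
import numpy as np


def eased_schedule(n_steps, power=0.5):
    """Evenly spaced values in [0, 1] raised to `power` (0.5 is a square root)."""
    return np.linspace(0.0, 1.0, n_steps) ** power


uniform = np.linspace(0.0, 1.0, 5)   # evenly spaced interpolation weights
weighted = eased_schedule(5)         # square-root weighted weights
step_sizes = np.diff(weighted)       # gaps between consecutive weights

for itr, i in enumerate(weighted):
    print(itr, round(float(i), 3))
```

With a power below 1 the first step is the largest and later steps get progressively smaller. A power above 1 would do the opposite and bunch the values up near 0.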
Has this been implemented as a user script under A1111 (similar to the "interpolate" one by DiceOwl)?