stable-diffusion-webui

[Feature Request]: Batch script for txt2img for higher quality generations from multiple image sources/animation frames OR ...

Open marcsyp opened this issue 2 years ago • 2 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

I have been comparing two setups: img2img with a source image, where that same image also drives all of my controlnet networks, and the exact same controlnet setup in txt2img (i.e., without the source image as a starting point). I have found that txt2img gives consistently better results, even after playing with weights, etc., in img2img.

However, txt2img doesn't currently allow you to process images in batch (makes sense). The current workflow for processing a large number of images is painful: you replace the source image for each controlnet one by one, click generate, and then move on to the next image (at least this is less painful with the new tab interface, but still painful).

Proposed workflow

  1. Load a "ControlNet Batch" script
  2. Provide a directory of source images (ideally one directory for each controlnet, up to 10)
  3. For each image, the script replaces the source image of each controlnet with the corresponding image from that controlnet's directory, runs a txt2img, and moves on to the next image.
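The loop in the steps above could be sketched roughly as follows. This is only an illustration of the pairing logic, not an actual webui or ControlNet script: `iter_frames`, `run_batch`, and the `generate` callback are hypothetical names, and the sketch assumes that images are matched across directories by sorted file name and that every directory holds the same number of images.

```python
from pathlib import Path
from typing import Callable, Iterator, Sequence

# Assumption: only these extensions count as source images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def list_images(directory: Path) -> list[Path]:
    """Return the image files in a directory, sorted by file name."""
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def iter_frames(unit_dirs: Sequence[Path]) -> Iterator[tuple[Path, ...]]:
    """Yield one tuple per frame: the i-th image from every controlnet directory."""
    if not 1 <= len(unit_dirs) <= 10:
        raise ValueError("expected between 1 and 10 controlnet directories")
    per_unit = [list_images(d) for d in unit_dirs]
    if len({len(images) for images in per_unit}) != 1:
        raise ValueError("every directory must hold the same number of images")
    yield from zip(*per_unit)


def run_batch(
    unit_dirs: Sequence[Path],
    generate: Callable[[tuple[Path, ...]], None],
) -> int:
    """Call generate once per frame and return the number of frames processed.

    `generate` is a hypothetical stand-in for whatever swaps the controlnet
    source images and triggers a single txt2img run.
    """
    count = 0
    for frame in iter_frames(unit_dirs):
        generate(frame)
        count += 1
    return count
```

With two directories such as `depth/` and `normal/`, `run_batch([Path("depth"), Path("normal")], generate)` would call `generate` once per frame with a `(depth_image, normal_image)` pair.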

Additional Features (Nice to haves)

The ability to vary the prompt by index position. Perhaps a directory of text files with different prompts, with a way to assign each file to multiple index positions, and with the A1111 prompt input as a fallback?
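One possible shape for the per-index prompt lookup, as a rough sketch only. The file-naming convention is an assumption made up for illustration: each `.txt` file is named after the index positions it covers, e.g. `0-9.txt` or `3,7,20-24.txt`. The helper names (`parse_indices`, `load_prompt_overrides`, `prompt_for`) are hypothetical as well.

```python
from pathlib import Path


def parse_indices(stem: str) -> set[int]:
    """Parse a file stem such as '0-9' or '3,7,20-24' into a set of indices."""
    indices: set[int] = set()
    for part in stem.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            indices.update(range(int(start), int(end) + 1))  # ranges are inclusive
        else:
            indices.add(int(part))
    return indices


def load_prompt_overrides(directory: Path) -> dict[int, str]:
    """Map each index position to the prompt text of the file that covers it."""
    overrides: dict[int, str] = {}
    for path in sorted(directory.glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        for index in parse_indices(path.stem):
            overrides[index] = text
    return overrides


def prompt_for(index: int, overrides: dict[int, str], fallback: str) -> str:
    """Return the override for this index, or the fallback prompt if none exists."""
    return overrides.get(index, fallback)
```

Here `fallback` would be whatever prompt is typed into the normal A1111 prompt box, so any index without a matching file behaves exactly as it does today.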

Another thought:

This may also be possible in img2img by simply providing a checkbox in the controlnet UI to ignore the source image (it may need to be a global checkbox?)

Thanks!

Additional information

No response

marcsyp avatar Feb 25 '23 15:02 marcsyp

Just a few use cases to think about regarding this feature. These relate particularly to animation, but could easily be applied to other use cases.

ANIMATION

  • Using normal mapping with a background or foreground threshold to isolate subject matter. This works really well in txt2img, but fails miserably in img2img because the background content being isolated still provides weighted noise to the controlnet.
  • Using low-resolution preprocessing to remove detail from a scene while maintaining overall coherence, which works much better in txt2img.

BATCH IMAGES

  • Using standard txt2img results with excellent composition as a low-fidelity proxy for a txt2img controlnet pass that adds high levels of detail without polluting the result with garbage pixel data.
  • The ability to process a batch of content that has been preprocessed elsewhere (for instance, normal maps or depth maps produced in external applications such as Blender), without needing to use the maps themselves as source data in img2img (which pollutes the controlnet result).

marcsyp avatar Feb 25 '23 18:02 marcsyp

Sorry, just realized I'm in the wrong repo :(

marcsyp avatar Feb 25 '23 18:02 marcsyp