stable-diffusion-webui-forge
[Feature Request]: Batch Unet then Batch VAE
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do?
Currently, with a high Batch count (not Batch size), Forge unloads the model and loads the VAE, then unloads the VAE and reloads the model, between every single generation. This reloading takes ~1.5 seconds in total on my system, slowing down the overall workflow.
Would it be possible to add a toggle that makes it switch to the VAE only after all latents have finished processing? Maybe store the intermediate latents in system RAM in the meantime?
Proposed workflow
- Enable the new toggle
- Generate with a high Batch count
- See all batches of latents being processed first
- Then see all latents being converted to images
- Observe only one occurrence of model loading
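The workflow above can be illustrated with a toy simulation (plain Python with hypothetical names, not Forge's actual API): it counts how many times a model is moved onto the GPU under the current per-generation swap versus the proposed batch-then-decode ordering.

```python
class LoadCounter:
    """Counts how many times any model is moved onto the GPU."""
    def __init__(self):
        self.loads = 0

    def load(self, name):
        self.loads += 1


def current_workflow(batch_count, gpu):
    # Current behaviour: swap UNet and VAE in and out every single generation.
    images = []
    for _ in range(batch_count):
        gpu.load("UNet")   # sample one batch of latents
        gpu.load("VAE")    # immediately decode it to an image
        images.append("image")
    return images


def proposed_workflow(batch_count, gpu):
    # Proposed behaviour: keep the UNet resident for all batches, park the
    # latents (e.g. in system RAM), then load the VAE once and decode them all.
    gpu.load("UNet")
    latents = ["latent" for _ in range(batch_count)]
    gpu.load("VAE")
    return ["image" for _ in latents]


current, proposed = LoadCounter(), LoadCounter()
current_workflow(8, current)
proposed_workflow(8, proposed)
print(current.loads, proposed.loads)  # 16 model moves vs. 2
```

With the ~1.24 s SDXL reload from the log below, the per-generation swap costs roughly an extra `batch_count` seconds over a whole run, which the batched ordering avoids.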
Additional information
Example console output between each generation:
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) = 6983.69580078125
[Memory Management] Model Memory (MB) = 159.55708122253418
[Memory Management] Minimal Inference Memory (MB) = 1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) = 5800.138719558716
Moving model(s) has taken 0.10 seconds
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) = 6983.6875
[Memory Management] Model Memory (MB) = 4897.086494445801
[Memory Management] Minimal Inference Memory (MB) = 1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) = 1062.6010055541992
Moving model(s) has taken 1.24 seconds
Edit: apparently this only happens when using SDXL checkpoints. Is it because my VRAM is barely enough?
Settings > VAE > Keep VAE in memory = 1