stable-diffusion-webui

Option to unload checkpoint or only load checkpoint until used

Open · zeptofine opened this issue 2 years ago · 1 comment

Is your feature request related to a problem? A big problem I have is that RAM and VRAM get filled with models that I may never use. For example, if I were to use this only to upscale in the Extras tab, there's no reason to load the SD checkpoint on startup.

Describe the solution you'd like A button in the settings that clears the models from RAM and VRAM, or an option to load them only once they're actually used. When enabled, the program would wait until something requires the SD models before loading them. This would be useful when you use features that don't require the model to be loaded. I imagine it wouldn't greatly affect things like gradio.app or Colab.
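A minimal sketch of the lazy-loading behaviour I'm describing, in plain Python; `load_sd_checkpoint`, `get_sd_model`, and `unload_sd_model` are hypothetical placeholder names, not actual webui functions:

```python
# Rough sketch of the requested behaviour, not the webui's actual loader:
# nothing touches RAM/VRAM until the first operation that needs the SD model.

_sd_model = None  # module-level cache for the loaded checkpoint


def load_sd_checkpoint():
    """Hypothetical stand-in for the real (expensive) checkpoint load."""
    print("loading SD checkpoint into RAM/VRAM...")
    return object()  # placeholder for the loaded model


def get_sd_model():
    """Return the checkpoint, loading it only on first use."""
    global _sd_model
    if _sd_model is None:
        _sd_model = load_sd_checkpoint()
    return _sd_model


def unload_sd_model():
    """Drop the cached model so RAM/VRAM can be reclaimed (the proposed settings button)."""
    global _sd_model
    _sd_model = None
```

With something like this, the Extras tab could run upscalers without ever triggering `get_sd_model()`, and the "clear" button would just call `unload_sd_model()`.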

Describe alternatives you've considered Other upscaling software, like chaiNNer and Cupscale. The problem is that they don't have the models (or at least download links for them) built in, and they're slow to set up relative to this repo. Also, neither chaiNNer nor Cupscale can be forwarded or remotely controlled.

Additional context: two attached screenshots (1, 2).

zeptofine · Oct 05 '22 15:10

This would be a really great feature to have: the more you switch between checkpoints, the more your VRAM seems to fill up, which eventually forces you into restarting.

Daralimah · Oct 22 '22 14:10

Settings/Stable Diffusion (screenshot attached). And it shouldn't occupy VRAM unless it's loaded; that's probably some old VRAM memory-leak bug.
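If you want to check whether unloading actually releases memory (or whether there really is a leak), a quick PyTorch-level check along these lines can help; it assumes a CUDA device is available and isn't webui-specific:

```python
import gc
import torch


def report_vram(label: str) -> None:
    """Print how much CUDA memory PyTorch currently has allocated and reserved."""
    alloc_mib = torch.cuda.memory_allocated() / 2**20
    reserved_mib = torch.cuda.memory_reserved() / 2**20
    print(f"{label}: allocated={alloc_mib:.0f} MiB, reserved={reserved_mib:.0f} MiB")


report_vram("before cleanup")
# After dropping all references to a model, force garbage collection and
# hand PyTorch's cached blocks back to the driver before measuring again.
gc.collect()
torch.cuda.empty_cache()
report_vram("after cleanup")
```

If "allocated" stays high after all model references are gone, something is still holding tensors, which would point at a leak rather than normal caching.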

mezotaken · Jan 12 '23 23:01

@mezotaken Thanks, I'm glad this repo now has this feature.

zeptofine · Jan 16 '23 19:01

The Ollama framework has a really handy variable, accessible both as an environment variable and through the API:

OLLAMA_KEEP_ALIVE=<number of seconds> | <duration such as 5m> | 0

I think it's mostly used by people who want the last loaded chat model to stay loaded longer, but I set it to zero to keep the GPU's VRAM as empty as possible, as soon as possible. This is because I have many users who mostly use the GPU for chat and only occasionally for text-to-speech and SD image creation, which loads up the GPU's VRAM. Unfortunately SDWeb keeps its last model loaded indefinitely. It would be great if SDWeb had a similar keep-alive option to let us decide how long to keep the last model loaded.
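A rough sketch of how such a keep-alive could behave on the webui side (just an illustration, not existing webui code; `unload_fn` stands in for whatever actually frees the checkpoint):

```python
import threading


class IdleUnloader:
    """Unload a model after `keep_alive` seconds of inactivity; 0 unloads right after each use."""

    def __init__(self, unload_fn, keep_alive: float):
        self._unload_fn = unload_fn
        self._keep_alive = keep_alive
        self._timer = None
        self._lock = threading.Lock()

    def touch(self) -> None:
        """Call after every request that used the model; restarts the countdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            if self._keep_alive <= 0:
                self._unload_fn()  # keep_alive=0: free the model immediately
                return
            self._timer = threading.Timer(self._keep_alive, self._unload_fn)
            self._timer.daemon = True
            self._timer.start()
```

Setting `keep_alive=0` here would mirror how I use OLLAMA_KEEP_ALIVE=0: the model is dropped as soon as each request finishes, and any nonzero value keeps it warm for that many seconds of idle time.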

randelreiss · Jul 03 '24 22:07