stable-diffusion-webui
[Feature Request]: Load big model to main RAM and reduce for VRAM
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do?
Add a command-line argument that loads a big model into main RAM and, based on the prompt and a configured maximum memory usage, sends a reduced ("biased") model to VRAM.
Proposed workflow
- Invoke the web UI with the arguments --bias-to-prompt --max-mem=6GB
- The model is loaded into main RAM and biased (reduced) for the prompt
- The reduced model is sent to the GPU
- An image is produced from the biased model in VRAM (see the offloading sketch below)
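For what it's worth, something in the spirit of steps 2-4 can already be sketched outside the web UI with the existing diffusers + accelerate CPU-offload machinery; the snippet below is only an illustration under that assumption (the --bias-to-prompt / --max-mem flags are the proposal and exist nowhere, and the prompt-based "biasing" step has no existing counterpart, so it is not shown).

```python
# Rough sketch of the load-to-RAM / offload-to-VRAM idea using diffusers +
# accelerate, outside the web UI. The model id and prompt are placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

# Weights stay in main RAM; each submodule (text encoder, UNet, VAE) is moved
# to VRAM only while it is running, then returned to RAM. This keeps peak VRAM
# usage low enough for ~8 GB cards, at the cost of extra PCIe transfers.
pipe.enable_model_cpu_offload()

image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("out.png")
```

The trade-off is exactly the one raised in the comment further down: every generation pays for moving weights between CPU RAM and VRAM.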
Additional information
This is intended for GPUs with at most 8 GB of VRAM. I guess transformers could be used for this, but I am only guessing. This way we could use bigger models on 8 GB GPUs.
In case it's of any use, ChatGPT says this can be done on CPU:
- Pruning: Prune the model to remove unnecessary connections or weights, reducing its size and memory footprint.
- Chunking: Divide the pruned model into smaller modules or components to facilitate dynamic loading and execution of only the necessary parts.
- Parameter Sharing: Apply weight sharing techniques to exploit redundancy within the model architecture, further reducing the number of unique parameters.
- Dynamic Graph Execution: Use dynamic computation graphs to construct and execute only the relevant parts of the model graph during inference, minimizing memory allocation for unused portions.
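As a rough illustration of the "Pruning" bullet only, here is a minimal PyTorch sketch; the toy linear layer and the 30% sparsity level are arbitrary assumptions, and unstructured pruning like this just zeroes weights, so on its own it does not shrink the VRAM footprint of a Stable Diffusion checkpoint.

```python
# Minimal, hypothetical illustration of unstructured magnitude pruning in
# PyTorch. The toy linear layer and the 30% sparsity level are arbitrary;
# zeroed weights are still stored, so this alone does not reduce memory use.
import torch
import torch.nn.utils.prune as prune

layer = torch.nn.Linear(512, 512)

# Zero out the 30% of weights with the smallest L1 magnitude.
prune.l1_unstructured(layer, name="weight", amount=0.3)

# Make the pruning permanent (drops the re-parametrization hooks).
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"Fraction of zeroed weights: {sparsity:.2%}")
```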
I mean, ChatGPT doesn't really have all the code context or anything... Loading the model into RAM is possible, but having the GPU exchange data with CPU RAM is really expensive in time and performance, especially for a ~10 GB checkpoint/model. I'm not a contributor to the project, so I'm just giving my two cents here, but I don't think this is a particularly good idea, and/or it's not really that easy.