stable-diffusion.cpp

[Feature] Sequential model loading

Open • Aatricks opened this issue 1 month ago • 1 comment

Feature Summary

The ability to load models one by one (loading each model only when the computation needs it) to reduce memory usage.

Detailed Description

I'm working on an Android library that uses llama.cpp and stable-diffusion.cpp for easy on-device inference, but I'm memory-limited for some models because the encoder, VAE and UNet are all loaded at the same time. Would it be possible to add sequential model loading, where each model is loaded only when it is needed, so that only one model is in memory at a time? This would drastically reduce the memory footprint.
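To illustrate the idea, here is a minimal, self-contained C++ sketch of why loading one model at a time lowers peak memory. It does not use the stable-diffusion.cpp API: `MemoryTracker`, `LoadedModel`, `Stage`, and both `peak_*` functions are hypothetical names, and the byte sizes in the usage note are made-up placeholders rather than measurements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping: tracks current and peak "resident" bytes.
struct MemoryTracker {
    std::size_t current = 0;
    std::size_t peak = 0;
    void acquire(std::size_t n) {
        current += n;
        peak = std::max(peak, current);
    }
    void release(std::size_t n) {
        assert(n <= current);
        current -= n;
    }
};

// Hypothetical RAII handle: weights count as loaded while the object lives
// and are freed in the destructor. A real loader would read the weights here.
class LoadedModel {
public:
    LoadedModel(std::string name, std::size_t bytes, MemoryTracker& tracker)
        : name_(std::move(name)), bytes_(bytes), tracker_(tracker) {
        tracker_.acquire(bytes_);
    }
    ~LoadedModel() { tracker_.release(bytes_); }
    LoadedModel(const LoadedModel&) = delete;
    LoadedModel& operator=(const LoadedModel&) = delete;
    const std::string& name() const { return name_; }

private:
    std::string name_;
    std::size_t bytes_;
    MemoryTracker& tracker_;
};

// One pipeline stage (e.g. encoder, UNet, VAE) and the size of its weights.
struct Stage {
    std::string name;
    std::size_t bytes;
};

// Every model stays resident for the whole run: peak is the sum of all sizes.
std::size_t peak_all_at_once(const std::vector<Stage>& stages) {
    MemoryTracker tracker;
    std::vector<std::unique_ptr<LoadedModel>> loaded;
    for (const auto& s : stages) {
        loaded.push_back(std::make_unique<LoadedModel>(s.name, s.bytes, tracker));
    }
    // ... all stages would run here ...
    return tracker.peak;
}

// Each model lives only for its own stage: peak is the largest single size.
std::size_t peak_sequential(const std::vector<Stage>& stages) {
    MemoryTracker tracker;
    for (const auto& s : stages) {
        LoadedModel model(s.name, s.bytes, tracker);
        // ... this stage would run here; its weights are freed at scope exit ...
    }
    return tracker.peak;
}
```

With three placeholder stages of 500, 3000 and 300 units, `peak_all_at_once` reports 3800 while `peak_sequential` reports 3000. The sketch ignores the outputs passed between stages, which would still need to stay in memory, and the extra load time if models have to be reloaded for each generation.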

Alternatives you considered

No response

Additional context

No response

Aatricks, Nov 16 '25 10:11

Possibly a duplicate of #908.

wbruna, Nov 16 '25 11:11