Support GGUF
Feature Idea
I have tested GGUF quantization on SD3 and Flux, and the results are great: smaller memory footprint and faster speed. I hope ComfyUI will support it.
Existing Solutions
https://github.com/city96/ComfyUI-GGUF
Other
https://github.com/leejet/stable-diffusion.cpp
I hope this LoRA issue with GGUF models in lowvram mode is also fixed: https://github.com/city96/ComfyUI-GGUF/issues/33
EDIT: Fixed!
Having GGUF built in, rather than in a custom node, would be very useful: newer models keep getting larger, and things like GGUF are increasingly needed to run them on consumer-grade hardware. ComfyUI-GGUF is already becoming an essential custom node to install, and anything along those lines should generally be folded into the main project.
AFAIK, for non-LLM models GGUF only gives reduced storage and memory use, and applying LoRAs needs some workarounds because dequantization upcasts the data type, I think?
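To illustrate the LoRA point, here is a minimal sketch (not ComfyUI-GGUF's actual implementation; the function names and the placeholder dequantizer are made up) of why patching a GGUF weight with a LoRA forces an upcast to a float dtype:

```python
# Sketch of LoRA patching on a quantized weight: the quantized blocks have
# to be dequantized to a float dtype before the low-rank delta can be added.
import torch

def dequantize(q_weight: torch.Tensor) -> torch.Tensor:
    # Placeholder for real GGUF block dequantization (e.g. Q4_K -> float).
    # Real kernels unpack 4/5/6-bit blocks; here we just cast so the sketch
    # stays self-contained.
    return q_weight.to(torch.float16)

def apply_lora(q_weight: torch.Tensor, lora_down: torch.Tensor,
               lora_up: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # Dequantize first: this is the upcast step that temporarily costs the
    # full fp16 footprint of the layer, which is what hurts in lowvram mode.
    w = dequantize(q_weight)
    delta = alpha * (lora_up.to(w.dtype) @ lora_down.to(w.dtype))
    return w + delta  # stays fp16 unless it is re-quantized afterwards

# Toy usage: a 4096x4096 layer with a rank-16 LoRA.
w_q = torch.randn(4096, 4096)        # stand-in for a quantized weight
down = torch.randn(16, 4096) * 0.01
up = torch.randn(4096, 16) * 0.01
patched = apply_lora(w_q, down, up)
print(patched.shape, patched.dtype)
```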
IIRC those concerns don't apply to some alternative quantization formats, some of which also quantize the activations, which gives faster performance too.