Support GGUF
Feature Idea
I have tested GGUF quantization on SD3 and Flux, and the results are great: smaller memory footprint and faster speed. I hope ComfyUI will support it.
Existing Solutions
https://github.com/city96/ComfyUI-GGUF
Other
https://github.com/leejet/stable-diffusion.cpp
I hope this LoRA issue with GGUF models in lowvram mode is also fixed: https://github.com/city96/ComfyUI-GGUF/issues/33
EDIT: Fixed!
Having GGUF built in, rather than in a custom node, would be very useful: newer models keep getting larger, and things like GGUF are increasingly needed to run them on consumer-grade hardware. ComfyUI-GGUF is already becoming an essential custom node to install, and anything along those lines should generally be folded into the main project.
AFAIK, for non-LLM models GGUF only gives reduced storage and memory use, and applying LoRAs needs some workarounds because dequantization upcasts the data type, I think?
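To illustrate the LoRA point, here is a minimal sketch (not ComfyUI-GGUF's actual implementation; the function names and the placeholder dequantizer are made up) of why patching a GGUF weight with a LoRA forces an upcast to a float dtype:

```python
# Sketch of LoRA patching on a quantized weight: the quantized blocks have
# to be dequantized to a float dtype before the low-rank delta can be added.
import torch

def dequantize(q_weight: torch.Tensor) -> torch.Tensor:
    # Placeholder for real GGUF block dequantization (e.g. Q4_K -> float).
    # Real kernels unpack 4/5/6-bit blocks; here we just cast so the sketch
    # stays self-contained.
    return q_weight.to(torch.float16)

def apply_lora(q_weight: torch.Tensor, lora_down: torch.Tensor,
               lora_up: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # Dequantize first: this is the upcast step that temporarily costs the
    # full fp16 footprint of the layer, which is what hurts in lowvram mode.
    w = dequantize(q_weight)
    delta = alpha * (lora_up.to(w.dtype) @ lora_down.to(w.dtype))
    return w + delta  # stays fp16 unless it is re-quantized afterwards

# Toy usage: a 4096x4096 layer with a rank-16 LoRA.
w_q = torch.randn(4096, 4096)        # stand-in for a quantized weight
down = torch.randn(16, 4096) * 0.01
up = torch.randn(4096, 16) * 0.01
patched = apply_lora(w_q, down, up)
print(patched.shape, patched.dtype)
```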
IIRC those concerns don't apply to some alternative quantization formats, some of which also quantize the activations, which gives faster performance too.