
SDNQ support

Open kanttouchthis opened this issue 3 weeks ago • 2 comments

Feature Idea

SDNQ provides low-bit quantization with good quality and performance. Incorporating it for on-the-fly quantization and for loading pre-quantized models would be great, especially for larger models like flux.2, where fp8 is too large even for 24GB GPUs. I have tried writing a custom node for this, but failed because ComfyUI's model and VRAM management got in the way. Compared to nunchaku, this approach doesn't depend on a custom model implementation from another dev team for each new model.
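
For the on-the-fly case, something along the lines of the diffusers quantization-backend pattern is what I had in mind. A minimal sketch, assuming sdnq exposes a diffusers-compatible `SDNQConfig` and that the parameter names (`weights_dtype`, `quantized_matmul`) match the current sdnq release; FLUX.1-dev is just a stand-in model id:

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from sdnq import SDNQConfig  # assumed import path; check the sdnq README

model_id = "black-forest-labs/FLUX.1-dev"  # stand-in; flux.2 would be the real target

# Quantize only the transformer on the fly while loading.
transformer = FluxTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
    quantization_config=SDNQConfig(
        weights_dtype="uint4",   # assumed parameter name for the weight storage dtype
        quantized_matmul=True,   # quantized matmul path, same setting as in the benchmark below
    ),
)

pipe = FluxPipeline.from_pretrained(
    model_id,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # standard diffusers offloading to stay within 24GB
```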

Existing Solutions

ComfyUI-SDNQ is a vibe-coded custom node that doesn't actually work.

Other

No response

kanttouchthis commented Dec 06 '25 20:12

SDNQ Models in the diffusers format for reference.
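
Loading one of those pre-quantized checkpoints shouldn't need an explicit config. A rough sketch, assuming importing sdnq is enough to register the backend with diffusers; the repo id is a placeholder:

```python
import torch
from diffusers import DiffusionPipeline
import sdnq  # noqa: F401  side-effect import, assumed to register the SDNQ backend

pipe = DiffusionPipeline.from_pretrained(
    "<user>/<model>-SDNQ-uint4",  # placeholder for one of the pre-quantized repos above
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
```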

kanttouchthis commented Dec 07 '25 01:12

Performance on an RTX 3090 (24GB) with 64GB system RAM. SDNQ uses quantized_matmul=True and torch.compile with the inductor backend; torch.compile was not used for the ComfyUI run due to OOM.

| Method | Speed | VRAM |
| --- | --- | --- |
| SDNQ uint4 + flashattn | 2.1 s/it | 22 GB |
| ComfyUI fp8mixed + sageattn | 4.6 s/it | 36 GB |
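
For reference, the compile setting on the SDNQ side is just the default inductor backend applied to the transformer. A minimal sketch, reusing the `pipe` object from the earlier snippets:

```python
import torch

# Compile only the transformer; the rest of the pipeline stays eager.
pipe.transformer = torch.compile(pipe.transformer, backend="inductor", fullgraph=False)
```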

kanttouchthis commented Dec 07 '25 13:12