HunyuanDiT
Load DialogGen in 4-bit to make usage on 24 GB consumer GPUs possible.
This changes DialogGen loading to use bitsandbytes 4-bit quantization. The quantized weights reduce overall memory usage, which makes inference possible on 24 GB consumer GPUs with DialogGen enabled.
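For illustration only, here is a minimal sketch of what 4-bit loading through bitsandbytes can look like, and why it saves memory. It is not the code from this change: it assumes the model is loaded through a Hugging Face `from_pretrained` call, and the helper names `weight_memory_gib` and `four_bit_load_kwargs`, the NF4 and double-quantization settings, and the 7-billion-parameter figure are all assumptions chosen for the example.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB.

    Ignores activations, the KV cache, and quantization metadata.
    """
    return num_params * bits_per_param / 8 / 1024**3


def four_bit_load_kwargs() -> dict:
    """Build keyword arguments for a Hugging Face `from_pretrained` call.

    Hypothetical helper. torch and transformers are imported lazily so the
    memory estimate above can be used without them installed.
    """
    import torch
    from transformers import BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # store weights in 4-bit
        bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
        bnb_4bit_quant_type="nf4",             # assumed quantization type
        bnb_4bit_use_double_quant=True,        # assumed; also quantizes the scales
    )
    # device_map="auto" relies on the accelerate package being installed.
    return {"quantization_config": quant_config, "device_map": "auto"}


if __name__ == "__main__":
    params = 7_000_000_000  # assumed parameter count, for illustration only
    print(f"16-bit weights: {weight_memory_gib(params, 16):.1f} GiB")
    print(f" 4-bit weights: {weight_memory_gib(params, 4):.1f} GiB")
```

The estimate covers weights only, so real usage will be somewhat higher. Under these assumptions, 4-bit weights take a quarter of the space of 16-bit weights, which is the headroom that lets the text model share a 24 GB card with the image pipeline.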