stable-diffusion-webui
[Feature Request]: Q-Diffusion
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do?
A Q-Diffusion implementation would give a speedup (by running weights in INT8/INT4 [with good outputs too!]) and shrink checkpoint file sizes.
TL;DR of the paper: quantizing diffusion models with a different calibration scheme loses very little precision, even at INT4.
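For context, a minimal sketch of what weight quantization means here, assuming simple per-tensor symmetric scaling (Q-Diffusion's actual calibration is more sophisticated; the tensor shape and bit-widths below are illustrative):

```python
import torch

def fake_quantize(w: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Round w to a signed n_bits grid, then map back to float."""
    qmax = 2 ** (n_bits - 1) - 1           # 127 for INT8, 7 for INT4
    scale = w.abs().max() / qmax           # per-tensor scale (Q-Diffusion uses finer granularity)
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale                       # dequantized weights, small rounding error

w = torch.randn(320, 320)                  # stand-in for a UNet conv/linear weight
for bits in (8, 4):
    err = (w - fake_quantize(w, bits)).abs().mean()
    print(f"INT{bits}: mean abs error {err:.5f}, "
          f"~{32 // bits}x smaller if weights were stored as int{bits}")
```

The file-size claim follows directly: an fp32 weight stored as INT4 plus a scale is roughly 8x smaller.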
Proposed workflow
Either:
- Have it in a separate tab (along with LoRAs)
  - Click the tab to select
  - Select the desired model

or:
- Have it be seamless [looks like a normal model, loads like a normal model]
Additional information
As of now, the hard part of the implementation is actually speeding up inference; even the paper notes this in its conclusion. Detecting a quantized model might also be hard, as it can be mixed in all sorts of ways (a naive heuristic is sketched below).
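For illustration only, detection might scan the checkpoint's state dict for integer weight tensors or quantizer-style keys; the key patterns here are pure guesses, which is exactly why robust detection is hard:

```python
import torch

def looks_quantized(ckpt_path: str) -> bool:
    """Naive heuristic: flag integer weight tensors or quantizer-style keys.

    The key patterns are hypothetical; a merged/mixed checkpoint could
    easily defeat this, which is the detection problem described above.
    """
    sd = torch.load(ckpt_path, map_location="cpu")
    sd = sd.get("state_dict", sd)
    for name, t in sd.items():
        if torch.is_tensor(t) and not t.is_floating_point():
            return True
        if "quant" in name.lower():        # hypothetical marker key
            return True
    return False
```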
Note: most of this won't be possible until the next paper, as only simulated quantization is implemented so far; with no real quantization there is no speedup/optimization, only a proof of concept.
Supporting SD 2.x will probably be either very easy or very hard.
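To make "simulated quantization" concrete: the released code snaps weights to the INT grid but still runs every matmul in floating point, so accuracy can be measured without anything getting faster. A sketch of the difference, with illustrative shapes and a per-tensor INT8 scale:

```python
import torch

w_fp32 = torch.randn(320, 320)            # stand-in for a UNet weight
x = torch.randn(1, 320)
scale = w_fp32.abs().max() / 127          # per-tensor INT8 scale (illustrative)

# Simulated ("fake") quantization, as in the released code: weights are
# snapped to the INT8 grid but kept in fp32, so the matmul runs at fp32
# speed -- it only measures the accuracy loss.
w_fake = (w_fp32 / scale).round().clamp(-128, 127) * scale
y_sim = x @ w_fake.T                      # fp32 kernel, no speedup

# Real quantization would store int8 and needs an actual int8 GEMM kernel
# to win; dequantizing on the fly, as below, gives the same numbers but
# none of the speed.
w_int8 = (w_fp32 / scale).round().clamp(-128, 127).to(torch.int8)
y_real = (x @ w_int8.float().T) * scale
```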
Hi @yoinked-h, this is an interesting blog post about quantization for SD: https://developer.nvidia.com/blog/accelerate-generative-ai-inference-performance-with-nvidia-tensorrt-model-optimizer-now-publicly-available/ It reports nearly 2x faster inference. Would the webui have any interest in supporting this feature?
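Not tested here, but the post-training quantization flow from that blog roughly follows the pattern below; the `mtq.quantize` call and the config name come from NVIDIA's ModelOpt docs as I recall them, and `unet` and `calib_batches` are hypothetical stand-ins, so treat this as a sketch rather than a recipe:

```python
# Hedged sketch of the TensorRT Model Optimizer PTQ flow from the linked blog.
# Assumes `pip install nvidia-modelopt` and an already-loaded diffusion UNet.
import modelopt.torch.quantization as mtq

def forward_loop(model):
    # Run a handful of representative denoising steps so activation ranges
    # can be calibrated; `calib_batches` is a hypothetical iterable.
    for latents, t, cond in calib_batches:
        model(latents, t, encoder_hidden_states=cond)

# INT8 SmoothQuant config; ModelOpt also ships FP8/INT4 configs.
unet = mtq.quantize(unet, mtq.INT8_SMOOTHQUANT_CFG, forward_loop)
# The quantized module is then exported (e.g. to TensorRT) for the ~2x win.
```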