
T5xxl gguf support

KintCark opened this issue 1 year ago • 5 comments

I tried using a t5xxl q3 GGUF, but it's not supported. Please add t5xxl GGUF support.

Also, does a flux GGUF get quantized again? I tried loading one, but it said it's an fp16 model; I couldn't load it until I added --type q2_k. Please also make it so we don't have to keep re-quantizing the models every generation, because flux takes 30 minutes to quantize.

KintCark · Aug 26 '24 03:08

I tried using a t5xxl q3 GGUF, but it's not supported. Please add t5xxl GGUF support.

It is supported; try the quants made specifically for sd.cpp: https://huggingface.co/Green-Sky/flux.1-schnell-GGUF/tree/main
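For example, a T5 GGUF from that repo can be passed with the --t5xxl flag alongside the other flux components. A minimal sketch (all file names are placeholders; substitute whatever you actually downloaded):

    ./sd --diffusion-model flux1-schnell-q3_k.gguf --clip_l clip_l.safetensors --t5xxl t5xxl-q3_k.gguf --vae ae.safetensors --cfg-scale 1.0 --sampling-method euler -p "a photo of a cat"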

SkutteOleg · Aug 26 '24 03:08

@pilot5657 what are those binaries supposed to "fix"?

Green-Sky · Aug 26 '24 05:08

@pilot5657 what are those binaries supposed to "fix"?

They're bots. I saw the same message from another account in an unrelated thread.

phudtran · Aug 26 '24 06:08

Can a moderator please remove these links? The programs are very likely viruses.

offbeat-stuff · Aug 26 '24 06:08

I tried using a t5xxl q3 GGUF, but it's not supported. Please add t5xxl GGUF support.

Also, does a flux GGUF get quantized again? I tried loading one, but it said it's an fp16 model; I couldn't load it until I added --type q2_k. Please also make it so we don't have to keep re-quantizing the models every generation, because flux takes 30 minutes to quantize.

You can use the 'convert' mode to quantize your model once and save it out to a .gguf file, which you can then load directly, instead of using --type to quantize at load time at the start of each run. The documentation has an example of how this works: https://github.com/leejet/stable-diffusion.cpp/blob/master/docs/quantization_and_gguf.md
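A one-time conversion might look like this (input/output paths and the quantization type here are illustrative):

    ./sd -M convert -m flux1-schnell.safetensors -o flux1-schnell-q2_k.gguf --type q2_k

After that, pass flux1-schnell-q2_k.gguf on subsequent runs, so the 30-minute quantization step only happens once.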

grauho · Aug 26 '24 10:08