
T5xxl gguf support

KintCark opened this issue 1 year ago • 5 comments

I tried using a t5xxl q3 GGUF, but it's not supported. Please add t5xxl GGUF support.

Also, does a flux GGUF get quantized again? I tried loading one, but it said it's an fp16 model; I couldn't load it until I added --type q2_k. Please also make it so we don't have to keep re-quantizing the models every generation, because flux takes 30 minutes to quantize.

KintCark · Aug 26 '24 03:08

I tried using a t5xxl q3 GGUF, but it's not supported. Please add t5xxl GGUF support.

It is supported; try the quants made specifically for sd.cpp: https://huggingface.co/Green-Sky/flux.1-schnell-GGUF/tree/main
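For example, a T5 GGUF from that repo can be passed with the --t5xxl flag alongside the other flux components. A minimal sketch (all file names are placeholders; substitute whatever you actually downloaded):

    ./sd --diffusion-model flux1-schnell-q3_k.gguf --clip_l clip_l.safetensors --t5xxl t5xxl-q3_k.gguf --vae ae.safetensors --cfg-scale 1.0 --sampling-method euler -p "a photo of a cat"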

SkutteOleg · Aug 26 '24 03:08

@pilot5657 what are those binaries supposed to "fix"?

Green-Sky · Aug 26 '24 05:08

@pilot5657 what are those binaries supposed to "fix"?

They're bots. I saw the same message from another account in an unrelated thread.

phudtran · Aug 26 '24 06:08

Can a moderator please remove these links? The programs are very likely viruses.

offbeat-stuff · Aug 26 '24 06:08

I tried using a t5xxl q3 GGUF, but it's not supported. Please add t5xxl GGUF support.

Also, does a flux GGUF get quantized again? I tried loading one, but it said it's an fp16 model; I couldn't load it until I added --type q2_k. Please also make it so we don't have to keep re-quantizing the models every generation, because flux takes 30 minutes to quantize.

You can use the 'convert' mode to quantize your model once and save it out to a .gguf file, which you can then load directly, instead of using --type to quantize at load time at the start of each run. The documentation has an example of how this works: https://github.com/leejet/stable-diffusion.cpp/blob/master/docs/quantization_and_gguf.md
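A one-time conversion might look like this (input/output paths and the quantization type here are illustrative):

    ./sd -M convert -m flux1-schnell.safetensors -o flux1-schnell-q2_k.gguf --type q2_k

After that, pass flux1-schnell-q2_k.gguf on subsequent runs, so the 30-minute quantization step only happens once.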

grauho · Aug 26 '24 10:08