llm-awq
GGUF export support / CPU inference
Hi, are there any plans to add support for GGUF export for CPU inference? Or is there any other way to run inference with an AWQ-quantized model on CPU?
Thanks, Tomek