llama.cpp
Quantized K cache (Q4_1 / Q4_0) produces garbled output with Qwen-72b-Chat IQ3_XXS / IQ2_XXS
With a Q8_0 K cache the output is fine.