Qwen2.5
Which performs better: 72b-text-v1.5-q6_K or 72b-text-v1.5-fp16?
I'd like to ask: which of Qwen's 72b-text-v1.5-q6_K and 72b-text-v1.5-fp16 performs better? Also, what do the "_K" and "-q" parts of a model tag stand for? For example:
72b-chat-v1.5-q3_K_L
It is about quantization: you can read q6 as roughly 6-bit quantization and q2 as roughly 2-bit quantization. The `K` marks llama.cpp's newer k-quant schemes, and a trailing `_S`/`_M`/`_L` indicates the small/medium/large variant of that quant level. fp16 / bf16 should perform best, since no precision is lost to quantization. Check Maxime's article to learn more: https://towardsdatascience.com/quantize-llama-models-with-ggml-and-llama-cpp-3612dfbcc172
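To get a feel for the trade-off, here is a back-of-the-envelope memory estimate for a 72B-parameter model at a few quant levels. The bits-per-weight figures are approximate assumptions (k-quants mix block sizes, so real GGUF file sizes differ somewhat):

```python
# Rough weight-storage estimate for a 72B-parameter model.
# Bits-per-weight values are approximations, not exact GGUF sizes.
PARAMS = 72e9

def approx_size_gb(bits_per_weight: float) -> float:
    """params * bits / 8 bits-per-byte, in gigabytes."""
    return PARAMS * bits_per_weight / 8 / 1e9

for name, bpw in [("fp16", 16.0), ("q6_K", 6.6), ("q3_K_L", 3.4), ("q2_K", 2.6)]:
    print(f"{name:8s} ~{approx_size_gb(bpw):6.1f} GB")
```

So fp16 weights alone need on the order of 144 GB, while q6_K fits in roughly 60 GB at a small quality cost, which is why the quantized tags exist at all.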
Thank you for your response, I've learned a lot.