Can chatglm.cpp run models quantized with llama.cpp?

Open • user-ZJ opened this issue 2 years ago • 1 comment

When I run on the GPU with llama.cpp, GPU utilization is lower than with chatglm.cpp, so I would like to run LLaMA models with chatglm.cpp.

Oct 30 '23 08:10 user-ZJ

llama.cpp and chatglm.cpp are both built on the same underlying library, ggml. You may want to check your quantization parameters and CUDA configuration.

Nov 02 '23 14:11 leonsama
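
To compare the two runtimes before changing any quantization or CUDA settings, it may help to measure GPU utilization the same way for both. The following is a minimal sketch, assuming an NVIDIA GPU with `nvidia-smi` on the PATH; the helper names `parse_gpu_utilization` and `sample_gpu_utilization` are hypothetical and are not part of llama.cpp or chatglm.cpp. On the llama.cpp side, one setting commonly worth checking is how many layers are offloaded to the GPU (the `-ngl` / `--n-gpu-layers` option), since layers left on the CPU tend to lower GPU utilization.

```python
import subprocess


def parse_gpu_utilization(csv_text: str) -> list[int]:
    """Parse the output of
    `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`,
    which prints one integer percentage per GPU, one per line."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def sample_gpu_utilization() -> list[int]:
    """Return the current utilization (in percent) of each visible GPU.

    Hypothetical helper: it only shells out to nvidia-smi and parses the result.
    """
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,  # raise if nvidia-smi is missing or fails
    )
    return parse_gpu_utilization(result.stdout)
```

Calling `sample_gpu_utilization()` periodically while each program generates text with the same prompt gives a like-for-like comparison of the two.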