chatllm.cpp

Support GGUF

Open trufae opened this issue 1 year ago • 1 comments

GGML is essentially no longer supported; all models moved to GGUF as the standard about a year ago. Are there any plans to support it here? I'm also wondering what the limitations are for handling a sliding window in GGUF compared to GGML, if that's the problem.

trufae avatar May 13 '24 22:05 trufae

chatllm.cpp is not a downstream app of llama.cpp, but an app based on ggml, just like llama.cpp. It supports some models that llama.cpp does not, and I won't wait for llama.cpp to support a model before porting it to chatllm.cpp. So, I need to maintain my own set of supported models.

Furthermore, since the implementation of some models was developed independently of llama.cpp, some tensors (k/v/q specifically) may use different formats/shapes, which makes them incompatible with each other.
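
The k/v/q incompatibility above can be illustrated with a small sketch. llama.cpp's conversion scripts permute the rows of the q/k projection weights to match its RoPE layout, so a checkpoint written for one layout loads as garbage in a loader expecting the other. The helper below resembles that interleaving scheme but is purely illustrative; the function name and shapes are hypothetical, not chatllm.cpp's or llama.cpp's actual code:

```python
def permute_qk_rows(w, n_head):
    """Interleave the first and second half of each head's rows.

    This mimics the kind of row permutation llama.cpp's converters
    apply to q/k weights for its RoPE layout (sketch only).
    """
    head_dim = len(w) // n_head
    half = head_dim // 2
    out = []
    for h in range(n_head):
        head = w[h * head_dim:(h + 1) * head_dim]
        # Pair row i of the first half with row i of the second half.
        for a, b in zip(head[:half], head[half:]):
            out.extend([a, b])
    return out

# One head with head_dim = 4; rows labelled by their original index.
wq = ["row0", "row1", "row2", "row3"]
print(permute_qk_rows(wq, n_head=1))  # ['row0', 'row2', 'row1', 'row3']
```

An implementation that stores rows in their original order and one that stores them permuted hold the same logical weights in byte-incompatible tensors, which is why a file format alone doesn't guarantee interoperability.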

Anyway, it seems possible to support GGUF for some models (e.g. LLaMA models). I will look into it later.

foldl avatar May 14 '24 02:05 foldl

@foldl any news on GGUF support?

lexasub avatar Feb 19 '25 19:02 lexasub

@lexasub Not ready yet. For now, you can convert models with convert.py in a single pass.

foldl avatar Feb 20 '25 11:02 foldl

We are not going to support GGUF. See ggmm.md

foldl avatar Mar 24 '25 11:03 foldl