chatllm.cpp

Support GGUF

Open trufae opened this issue 1 year ago • 1 comments

GGML is essentially no longer supported; all models moved to GGUF as the standard about a year ago. Are there any plans to support it here? I'm also wondering what the limitations are for handling a sliding window in GGUF compared to GGML, if that's the problem.

trufae avatar May 13 '24 22:05 trufae

chatllm.cpp is not a downstream app of llama.cpp, but an app based on ggml, just like llama.cpp. It supports some models that llama.cpp does not, and I won't wait for llama.cpp to support a model before porting it to chatllm.cpp. So, I need to maintain my own set of supported models.

Furthermore, since the implementation of some models was developed independently of llama.cpp, some tensors (k/v/q specifically) may use different formats/shapes, which makes them incompatible with each other.
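
The k/v/q incompatibility above can be illustrated with a small sketch. llama.cpp's conversion scripts permute the rows of the q/k projection weights to match its RoPE layout, so a checkpoint written for one layout loads as garbage in a loader expecting the other. The helper below resembles that interleaving scheme but is purely illustrative; the function name and shapes are hypothetical, not chatllm.cpp's or llama.cpp's actual code:

```python
def permute_qk_rows(w, n_head):
    """Interleave the first and second half of each head's rows.

    This mimics the kind of row permutation llama.cpp's converters
    apply to q/k weights for its RoPE layout (sketch only).
    """
    head_dim = len(w) // n_head
    half = head_dim // 2
    out = []
    for h in range(n_head):
        head = w[h * head_dim:(h + 1) * head_dim]
        # Pair row i of the first half with row i of the second half.
        for a, b in zip(head[:half], head[half:]):
            out.extend([a, b])
    return out

# One head with head_dim = 4; rows labelled by their original index.
wq = ["row0", "row1", "row2", "row3"]
print(permute_qk_rows(wq, n_head=1))  # ['row0', 'row2', 'row1', 'row3']
```

An implementation that stores rows in their original order and one that stores them permuted hold the same logical weights in byte-incompatible tensors, which is why a file format alone doesn't guarantee interoperability.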

Anyway, it seems possible to support GGUF for some models (e.g. LLaMA models). I will look into it later.

foldl avatar May 14 '24 02:05 foldl

@foldl any news on GGUF support?

lexasub avatar Feb 19 '25 19:02 lexasub

@lexasub Not ready yet. For now, you can convert models with convert.py in a single pass.

foldl avatar Feb 20 '25 11:02 foldl

We are not going to support GGUF. See ggmm.md

foldl avatar Mar 24 '25 11:03 foldl