[FEATURE] GGUF variant?
Feature request
Please create a GGUF variant, since GGUF is the de facto standard for running models locally.
Motivation
A GGUF build would make the model much more accessible and fit the standard workflow of local model platforms such as Ollama. Llama derivatives can generally run on Ollama with little extra effort.
Your contribution
See how this one is built: https://ollama.com/library/llava
It seems they already have a plan for this: https://github.com/THUDM/CogVLM2/issues/2#issuecomment-2123935205
Ollama is fantastic! But it seems we need to modify the C++ code to support a new architecture. We are looking into it and would appreciate any advice. @AdaptiveStep
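For reference, this is a sketch of the usual llama.cpp-to-Ollama workflow once an architecture is supported by the converter. The checkpoint path and output filename here are hypothetical; CogVLM2 would first need the C++ architecture support discussed above before step 1 works.

```shell
# 1. Convert the Hugging Face checkpoint to GGUF with llama.cpp's converter
#    (hypothetical paths; requires the architecture to be supported):
#    python convert_hf_to_gguf.py ./cogvlm2-llama3-chat-19B --outfile cogvlm2.gguf

# 2. Write a Modelfile pointing Ollama at the GGUF file:
cat > Modelfile <<'EOF'
FROM ./cogvlm2.gguf
EOF

# 3. Register and run the model with Ollama:
#    ollama create cogvlm2 -f Modelfile
#    ollama run cogvlm2
```

The conversion and `ollama` commands are left as comments because they need the model weights and an Ollama install; only the Modelfile step runs as-is.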