OpenLLM
                        feat: GGML model support
Feature request
Add support for running GGML models via ctransformers (https://github.com/marella/ctransformers) or llama.cpp (https://github.com/abetlen/llama-cpp-python).
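A rough sketch of how the two runtimes could be split by model family, based on the discussion below (llama.cpp covers llama-family models well on M1/Metal, while ctransformers handles StarCoder and Falcon). `select_backend` is a hypothetical helper, not an existing OpenLLM API, and the family list is illustrative:

```python
# Hypothetical dispatch between the two GGML runtimes mentioned above.
# Neither the function nor the family list is part of OpenLLM today.

def select_backend(model_type: str) -> str:
    """Route llama-family models to llama-cpp-python (good Metal/M1
    support) and other architectures (e.g. starcoder, falcon) to
    ctransformers."""
    llama_family = {"llama", "alpaca", "vicuna", "codellama"}
    if model_type.lower() in llama_family:
        return "llama-cpp-python"
    return "ctransformers"

# With ctransformers, the actual load would look roughly like (untested,
# model path is illustrative):
#   from ctransformers import AutoModelForCausalLM
#   llm = AutoModelForCausalLM.from_pretrained(
#       "path/to/starcoder-ggml", model_type="starcoder")
#   print(llm("def fib(n):", max_new_tokens=64))
```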
Motivation
CPU support for StarCoder and eventually Falcon models, plus overall performance improvements.
Other
No response
I can't seem to run inference on M1 for StarCoder and Falcon.
StarCoder works for me with ctransformers but not llama.cpp. I have some examples here: https://huggingface.co/spaces/matthoffner/starchat-ggml
llama.cpp now has great M1 support with Metal.
Edit: Falcon is now working with ctransformers.
Got it, I'll take a look after I finish the fine-tuning API.
Will track development in #178.