LLaMa2lang [Question] What framework is able to load this adapter and serve the resulting model as an OpenAI endpoint?

[Question] What framework is able to load this adapter and serve the resulting model as an OpenAI endpoint?

Open Mayorc1978 opened this issue 1 year ago • 1 comments

Problem description/Question As per title what framework, like ollama, lmstudio, vllm is able to use the language adapter and serve it as an OpenAI API compatible endpoint?

May 05 '24 10:05 Mayorc1978

Yes, you can merge all adapters with the base model (see inference script for example) and load it in any tool. Alternatively you can convert the models to GGUF and load those. Mind you though that for llama3 QLoRA there is an open bug when merging the adapters and then converting to GGUF

May 05 '24 14:05 ErikTromp

LLaMa2lang LLaMa2lang copied to clipboard

[Question] What framework is able to load this adapter and serve the resulting model as an OpenAI endpoint?

LLaMa2lang
LLaMa2lang copied to clipboard