LLaMa2lang
[Question] What framework is able to load this adapter and serve the resulting model as an OpenAI endpoint?
Problem description/Question: As per the title, which framework (such as ollama, lmstudio, or vllm) can use the language adapter and serve it as an OpenAI API-compatible endpoint?
You can merge all adapters with the base model (see the inference script for an example) and then load the merged model in any tool. Alternatively, you can convert the models to GGUF and load those. Note, though, that for llama3 QLoRA there is an open bug when merging the adapters and then converting to GGUF.
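For reference, here is a minimal sketch of the merge step using the Hugging Face `peft` and `transformers` libraries. It is not the repository's inference script. The function name `merge_adapter` and its parameters are made up for illustration, and it assumes the adapter is a standard PEFT LoRA/QLoRA adapter and that the base model's tokenizer is the right one to ship with the merged model.

```python
def merge_adapter(base_model_id: str, adapter_id: str, output_dir: str) -> str:
    """Fold a PEFT LoRA/QLoRA adapter into its base model and save the result.

    Hypothetical helper for illustration; all three arguments are
    placeholders that you supply yourself.
    """
    # Imported lazily so this file can be imported without the heavy
    # dependencies (torch, transformers, peft) being installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model in half precision rather than 4-bit, so the
    # adapter weights can be merged into regular dense weights.
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.float16
    )

    # Attach the adapter, then merge it into the base weights and drop
    # the PEFT wrapper.
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()

    # Save the merged model plus a tokenizer so the directory is
    # self-contained. Assumption: the base tokenizer is unchanged.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
    return output_dir
```

Calling `merge_adapter("<base model>", "<adapter>", "./merged")` should leave a regular Hugging Face model directory in `./merged`. Any server that loads such a directory and exposes an OpenAI-compatible API (vLLM is one example) can then be pointed at it.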