Lora models / Lora training
Would it be possible to add LoRA support, similar to the existing embedding support? That way we could load LoRA adapters during text generation to fine-tune a model's output.
Also wondering if it would be possible to add LoRA training, so we could train a LoRA directly via Ollama to run on top of models.
I have noticed LoRAs seem to really improve model results, so it would be nice to be able to load them directly on top of models within Ollama.
Ollama does support LoRA: add it as an adapter in a Modelfile. Read more about it here: https://github.com/ollama/ollama/blob/798b107f19ed832d33a6816f11363b42888aaed3/docs/modelfile.md#adapter
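For reference, a minimal Modelfile using the `ADAPTER` instruction looks something like this (the base model name and adapter path are placeholders, not real files):

```
# Base model the adapter was trained against
FROM llama2
# Path to the LoRA adapter file
ADAPTER ./my-lora-adapter.ggla
```

Then build it with `ollama create my-tuned-model -f Modelfile`. Note that the adapter must have been trained against the same base model specified in `FROM`, or the results will be erratic.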
Thank you for the response, I will play around with it 👍
Let's close this in favour of #4618. You can use Ollama to load GGLA-based LoRA adapters (GGLA being the old "gguf"-style file format specifically for LoRAs), but it's practically impossible to make them work right now.
I have been working on a change to allow importing LoRAs from MLX (#5524) which right now is just in NPZ format, although adding safetensors shouldn't be hard. There are still a few things to sort out:
- GGLA is missing some options which we need to make this work well with MLX/Unsloth
- There is an upstream change in llama.cpp which will deprecate GGLA altogether in favour of GGUF (which is welcome, because GGLA has some real shortcomings)