customise number of experts in mixtral
Could someone provide guidance or documentation on how to adjust the number of experts in Mixtral? I'm particularly interested in whether there's a way to adjust this number dynamically based on the requirements of different tasks or scenarios.
I'm not sure what Ollama uses internally, but with the llama.cpp backend you can override a key in the model metadata with:

--override-kv KEY=TYPE:VALUE

The llama.cpp help text describes it as: "advanced option to override model metadata by key. may be specified multiple times. types: int, float, bool. example: --override-kv tokenizer.ggml.add_bos_token=bool:false"
For example, I override the number of experts used per token with:
--override-kv llama.expert_used_count=int:3
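For context, a minimal sketch of what a full llama.cpp invocation could look like with that override; the model path and prompt below are placeholders, not from this thread:

```
# Hypothetical llama.cpp run that loads a Mixtral GGUF and routes each token
# through 3 experts instead of the model's default (paths/prompt are examples only).
./main -m ./models/mixtral-8x7b-instruct.Q4_K_M.gguf \
  --override-kv llama.expert_used_count=int:3 \
  -p "Explain mixture-of-experts routing in one paragraph."
```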
But I don't think this is supported in an Ollama Modelfile yet.
How can I do this with Ollama? Wondering if anyone can help.
Figured it out yet?
Any update on this?