
Customise number of experts in Mixtral

Open scienlabs opened this issue 2 years ago • 3 comments

Could someone provide guidance or documentation on how to adjust the number of experts in Mixtral? I'm particularly interested in whether this number can be adjusted dynamically to suit different tasks or scenarios.

scienlabs avatar Dec 15 '23 20:12 scienlabs

I'm not sure what Ollama uses, but with the llama.cpp backend you can override a metadata key in the model with:

--override-kv KEY=TYPE:VALUE
                        advanced option to override model metadata by key. may be specified multiple times.
                        types: int, float, bool. example: --override-kv tokenizer.ggml.add_bos_token=bool:false

For example I override them using:

--override-kv llama.expert_used_count=int:3

But I don't think this is supported in an Ollama Modelfile yet.
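
If you want to script this, here is a minimal Python sketch that builds the repeated `--override-kv KEY=TYPE:VALUE` arguments from a dict. The helper name `build_override_args`, the binary path `./main`, and the model file `mixtral.gguf` are placeholders I made up for illustration; only the `--override-kv` syntax and the `llama.expert_used_count` key come from the llama.cpp help text and example above.

```python
import shlex


def build_override_args(overrides):
    """Turn {key: value} into repeated --override-kv KEY=TYPE:VALUE arguments.

    Hypothetical helper; supports the three types listed in the help text
    (int, float, bool).
    """
    args = []
    for key, value in overrides.items():
        # Check bool before int: in Python, bool is a subclass of int.
        if isinstance(value, bool):
            typed = "bool:true" if value else "bool:false"
        elif isinstance(value, int):
            typed = f"int:{value}"
        elif isinstance(value, float):
            typed = f"float:{value}"
        else:
            raise TypeError(f"unsupported type for {key}: {type(value).__name__}")
        args += ["--override-kv", f"{key}={typed}"]
    return args


# Placeholder binary and model paths; adjust to your own llama.cpp build.
cmd = ["./main", "-m", "mixtral.gguf", "-p", "Hello"]
cmd += build_override_args({"llama.expert_used_count": 3})

# Print the command line instead of running it, so the sketch works
# without llama.cpp installed.
print(shlex.join(cmd))
```

Since the override is passed when the model is loaded, changing the value per task would presumably mean starting llama.cpp again with a different number.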

RafaAguilar avatar Dec 19 '23 19:12 RafaAguilar

How can I do this with Ollama? Wondering if anyone can help.

scienlabs avatar Dec 27 '23 17:12 scienlabs

Figured it out yet?

PLK2 avatar May 08 '24 15:05 PLK2

Any update on this?

ColumbusAI avatar Aug 02 '24 03:08 ColumbusAI