Mixtral manual `head_dim`
Feature request
https://github.com/huggingface/transformers/blob/816f4424964c1a1631e303b663fc3d68f731e923/src/transformers/models/mixtral/modeling_mixtral.py#L284
`head_dim` in the Mixtral model is forced to the value of hidden_size // num_heads. However, this is not the case in the Llama model or even in the Mistral model. So it would be a nice minor feature to support a manual `head_dim` setting for the Mixtral model as well!
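For illustration, here is a minimal runnable sketch of the difference (using a hypothetical `Config` stand-in rather than the actual transformers config class; the exact attribute handling in the linked revision may differ):

```python
# Hypothetical config stand-in, only for demonstrating the two behaviours.
class Config:
    hidden_size = 4096
    num_attention_heads = 32
    head_dim = 256  # manually chosen, not hidden_size // num_attention_heads


config = Config()

# Current Mixtral behaviour: head_dim is always derived, so the manual value is ignored.
forced_head_dim = config.hidden_size // config.num_attention_heads  # -> 128

# Llama/Mistral-style behaviour: use config.head_dim when set, otherwise fall back to the derived value.
manual_head_dim = getattr(
    config, "head_dim", config.hidden_size // config.num_attention_heads
)  # -> 256

print(forced_head_dim, manual_head_dim)
```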
Motivation
- manual `head_dim` in the Llama or Mistral model
Your contribution
I can submit a PR.