
Mixtral manual `head_dim`

Open wavy-jung opened this issue 1 year ago • 0 comments

Feature request

https://github.com/huggingface/transformers/blob/816f4424964c1a1631e303b663fc3d68f731e923/src/transformers/models/mixtral/modeling_mixtral.py#L284 In the Mixtral model, `head_dim` is forced to equal `hidden_size // num_heads`. However, this is not the case in the Llama model, or even in the Mistral model. So it would be a good minor feature to support a manual `head_dim` setting for the Mixtral model as well!
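To make the request concrete, here is a minimal sketch of the fallback being asked for: use an explicit `head_dim` when one is given, and otherwise derive it from `hidden_size // num_attention_heads`. `SketchConfig`, `resolve_head_dim`, and `projection_shapes` are hypothetical names made up for illustration; they are not part of the `transformers` API, and the actual change to `modeling_mixtral.py` may look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a model config (NOT the real MixtralConfig)."""
    hidden_size: int = 4096
    num_attention_heads: int = 32
    head_dim: Optional[int] = None  # None means "derive from hidden_size"


def resolve_head_dim(config: SketchConfig) -> int:
    """Return the explicit head_dim if set, else hidden_size // num_attention_heads."""
    if config.head_dim is not None:
        return config.head_dim
    return config.hidden_size // config.num_attention_heads


def projection_shapes(config: SketchConfig) -> dict:
    """Weight shapes (in_features, out_features) implied by the resolved head_dim."""
    head_dim = resolve_head_dim(config)
    q_out = config.num_attention_heads * head_dim
    return {
        "q_proj": (config.hidden_size, q_out),
        "o_proj": (q_out, config.hidden_size),
    }


if __name__ == "__main__":
    # Default behaviour: head_dim is derived from the hidden size.
    print(resolve_head_dim(SketchConfig()))
    # Manual override: head_dim no longer has to match hidden_size // num_heads.
    print(projection_shapes(SketchConfig(head_dim=64)))
```

With a manual `head_dim`, the query and output projections can no longer assume that `num_attention_heads * head_dim == hidden_size`, so their shapes would have to be computed from the resolved value, as in the sketch.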

Motivation

  • The Llama and Mistral models already allow a manual `head_dim`

Your contribution

PR

wavy-jung, Oct 19 '24 03:10