
Mistral support

Open Nikita-Sherstnev opened this issue 1 year ago • 1 comments

Would it be hard to adapt this code for Mistral? I tried the OpenOrca version and set vocab_size in the config to 32002, but the shapes did not match:

File "/experiments/dev/nsherstnev/gpt-fast/scripts/convert_hf_checkpoint.py", line 61, in permute
    w.view(n_head, 2, config.head_dim // 2, dim)
RuntimeError: shape '[32, 2, 64, 4096]' is invalid for input of size 4194304

Nikita-Sherstnev avatar Dec 08 '23 10:12 Nikita-Sherstnev

You'll need to change some more configuration params (e.g. n_local_heads should be 8).

I'd copy them from here: https://huggingface.co/docs/transformers/main/model_doc/mistral#transformers.MistralConfig

bwasti avatar Dec 11 '23 17:12 bwasti
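The sizes in the traceback are consistent with the suggestion above. A rough sketch, assuming the Mistral-7B defaults from MistralConfig (hidden size 4096, 32 attention heads, 8 key/value heads): the view asks for 32 * 2 * 64 * 4096 elements, while a key projection with 8 heads of size 128 only has 4194304. The `permute` below is a NumPy stand-in modeled on the `view` call in the traceback; the transpose and final reshape are an assumption about the rest of the function, and the small dimensions are made up for illustration.

```python
import numpy as np

# Assumed Mistral-7B values (MistralConfig defaults), not read from a checkpoint.
DIM = 4096
N_HEAD = 32          # query heads
N_LOCAL_HEADS = 8    # key/value heads (grouped-query attention)
HEAD_DIM = DIM // N_HEAD  # 128

def view_numel(heads, head_dim=HEAD_DIM, dim=DIM):
    """Element count implied by view(heads, 2, head_dim // 2, dim)."""
    return heads * 2 * (head_dim // 2) * dim

# Size of a key (or value) projection weight with 8 heads of size 128.
K_PROJ_NUMEL = N_LOCAL_HEADS * HEAD_DIM * DIM

print(view_numel(N_HEAD))         # what the failing view expected
print(K_PROJ_NUMEL)               # what the tensor actually holds
print(view_numel(N_LOCAL_HEADS))  # matches once the head count is 8

def permute(w, heads, head_dim, dim):
    """Hypothetical NumPy analogue of the permute in the traceback."""
    return (
        w.reshape(heads, 2, head_dim // 2, dim)
        .transpose(0, 2, 1, 3)
        .reshape(heads * head_dim, dim)
    )

# Scaled-down illustration: 4 query heads, 2 key/value heads, dim 16.
small_dim, small_heads, small_kv_heads = 16, 4, 2
small_head_dim = small_dim // small_heads
k_small = np.arange(
    small_kv_heads * small_head_dim * small_dim, dtype=np.float32
).reshape(small_kv_heads * small_head_dim, small_dim)

# Using the key/value head count works.
k_ok = permute(k_small, small_kv_heads, small_head_dim, small_dim)

# Using the query head count reproduces the kind of mismatch in the issue.
try:
    permute(k_small, small_heads, small_head_dim, small_dim)
    mismatch = False
except ValueError:
    mismatch = True
print(k_ok.shape, mismatch)
```

If this reading is right, the query projection can keep the 32-head permute, while the key projection needs the key/value head count.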

Done in #116. The issue can be closed now.

Artyom17 avatar Feb 29 '24 23:02 Artyom17