gpt-fast Mistral support

Open. Nikita-Sherstnev opened this issue 1 year ago • 1 comment

Would it be hard to adapt this code for Mistral? I tried the OpenOrca version and set vocab_size in the config to 32002, but the shapes did not match:

File "/experiments/dev/nsherstnev/gpt-fast/scripts/convert_hf_checkpoint.py", line 61, in permute
    w.view(n_head, 2, config.head_dim // 2, dim)
RuntimeError: shape '[32, 2, 64, 4096]' is invalid for input of size 4194304

Dec 08 '23 10:12 Nikita-Sherstnev
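
For anyone hitting the same traceback, the numbers in the error can be checked by hand. The sketch below assumes Mistral-7B-style dimensions (dim 4096, 32 query heads, 8 key/value heads); these values are assumptions taken from the public Mistral configuration, not read from this checkpoint, and the failing tensor is presumed to be the key projection. It only reproduces the arithmetic behind the `view` call, not the conversion script itself.

```python
# Assumed Mistral-7B-style dimensions; verify against your checkpoint's config.json.
dim = 4096                 # hidden size
n_head = 32                # query heads
n_local_heads = 8          # key/value heads (grouped-query attention)
head_dim = dim // n_head   # 128, so head_dim // 2 == 64 as in the error message


def view_numel(heads: int) -> int:
    """Element count implied by w.view(heads, 2, head_dim // 2, dim)."""
    return heads * 2 * (head_dim // 2) * dim


# A key projection with 8 heads has (n_local_heads * head_dim) rows and dim columns.
kv_weight_numel = n_local_heads * head_dim * dim

print(kv_weight_numel)             # 4194304, the input size in the error
print(view_numel(n_head))          # 16777216, what the view needs with 32 heads
print(view_numel(n_local_heads))   # 4194304, matches once 8 heads are used
```

The weight holds 4194304 elements, while a view shaped for 32 heads needs 16777216, so the reshape cannot succeed until the key/value head count used by the script matches the checkpoint.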

You'll need to change some more configuration params (e.g. n_local_heads should be 8).

I'd copy them from here: https://huggingface.co/docs/transformers/main/model_doc/mistral#transformers.MistralConfig

Dec 11 '23 17:12 bwasti
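
A minimal sketch of what copying those parameters might look like. `ModelArgsSketch` and `from_hf` are hypothetical names made up for this illustration and are not gpt-fast's actual classes; the field names `n_layer` and `intermediate_size` are likewise assumed. The Hugging Face keys are the ones documented for `MistralConfig`, and the values are that page's defaults except `vocab_size`, which uses the 32002 mentioned above. Check every value against the checkpoint's own config.json before relying on it.

```python
from dataclasses import dataclass


@dataclass
class ModelArgsSketch:
    """Hypothetical stand-in for a gpt-fast-style config, for illustration only."""
    vocab_size: int
    n_layer: int
    n_head: int
    n_local_heads: int   # key/value heads; 8 for Mistral per the comment above
    dim: int
    intermediate_size: int

    @property
    def head_dim(self) -> int:
        return self.dim // self.n_head


def from_hf(hf: dict) -> ModelArgsSketch:
    """Map Hugging Face MistralConfig keys onto the sketch's field names."""
    return ModelArgsSketch(
        vocab_size=hf["vocab_size"],
        n_layer=hf["num_hidden_layers"],
        n_head=hf["num_attention_heads"],
        n_local_heads=hf["num_key_value_heads"],
        dim=hf["hidden_size"],
        intermediate_size=hf["intermediate_size"],
    )


# MistralConfig defaults, with vocab_size overridden as in the original question.
hf_config = {
    "vocab_size": 32002,
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
}

args = from_hf(hf_config)
print(args.n_local_heads, args.head_dim)
```

Keeping the mapping in one function makes it easy to see which Hugging Face key feeds which field, which is where the original mismatch (32 heads instead of 8 for keys and values) would have been caught.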

Done in #116. The issue can be closed now.

Feb 29 '24 23:02 Artyom17
