[Feature] Support for Mistral
Motivation
Mistral 7B is currently the best LLM under 10B parameters (and arguably even under 30B), according to the latest HELM evaluation (https://crfm.stanford.edu/helm/v0.4.0/#/leaderboard), and it also scores very high on the Open LLM Leaderboard on Hugging Face (https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
In my personal experience, it is by far the easiest model to fine-tune to great results: it is small, and its sliding-window attention mechanism makes it easy to fine-tune on an 8k/16k context window without running out of memory.
Therefore, I would really appreciate it if support were added for it. There are also rumors that Mistral will soon release another open-source LLM, and when that happens it will likely be the new SOTA #1, so I think having day-0 support would be great for lmdeploy.
Thank you very much for your amazing library!
Related resources
No response
Additional context
No response
Hi @lzhangzz, I wanted to gently ping and see if there are any updates. Mistral & Mixtral models are the current SOTA, and it would be great if lmdeploy could support them, imo. Thanks!
Currently, lmdeploy has no problem running mistral-7b. The plan is to add chat template after window attention is supported.
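For anyone landing here from search, a minimal sketch of trying Mistral-7B with lmdeploy's chat CLI. The model ID and the exact subcommand are assumptions based on recent lmdeploy releases (older versions used a different CLI layout), so treat this as illustrative rather than official docs:

```shell
# Sketch: interactive chat with Mistral-7B via lmdeploy's CLI.
# Assumes `pip install lmdeploy` and a GPU with enough memory for fp16 weights.
# The Hugging Face model ID is an assumption; substitute a local path if preferred.
lmdeploy chat mistralai/Mistral-7B-Instruct-v0.1
```

Note that, per the comment above, this runs the model without sliding-window attention, so behavior beyond the native attention span may differ from the reference implementation.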
Okay, thank you! I did not know that, because last time I tried it, it wasn't supported with TurboMind, and it is not on the list of supported models. So right now the implementation is not utilizing the sliding window mechanism at all?
Right. TurboMind hasn't implemented window attention.
Hey, I see that there is now support for fp16 mistral, thank you! Would it be possible to also support int4 mistral as well?
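In case it helps others asking the same thing: int4 in lmdeploy generally means AWQ weight-only quantization. A hedged sketch of that workflow, based on recent lmdeploy documentation (subcommand and flag names may differ in older versions):

```shell
# Sketch: quantize Mistral-7B to int4 with AWQ, then chat with the result.
# Assumes a recent lmdeploy release; the model ID is an assumption.
lmdeploy lite auto_awq mistralai/Mistral-7B-Instruct-v0.1 \
    --work-dir ./mistral-7b-awq

# Load the quantized weights, telling the engine they are in AWQ format.
lmdeploy chat ./mistral-7b-awq --model-format awq
```

Whether this works end-to-end for Mistral depends on the model being on lmdeploy's supported list for quantization, which is exactly what this comment is asking about.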