[Feature] Support for Mistral
Motivation
Mistral 7B is currently the best LLM under 10B parameters (and arguably even under 30B), according to the latest HELM evaluation (https://crfm.stanford.edu/helm/v0.4.0/#/leaderboard), and it also scores very high on the Open LLM Leaderboard on Hugging Face (https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
In my personal experience, it is by far the easiest model to fine-tune to great results: it is small, and its sliding-window attention mechanism makes it easy to fine-tune on an 8k/16k context window without running out of memory.
Therefore, I would really appreciate it if support were added for it. There are also rumors that Mistral will soon release another open-source LLM, and when that happens it will likely be the new SOTA #1, so I think having day-0 support would be great for lmdeploy.
Thank you very much for your amazing library!
Related resources
No response
Additional context
No response
Hi @lzhangzz, I wanted to gently ping and see if there are any updates. Mistral & Mixtral models are the current SOTA, and it would be great if lmdeploy could support them, imo. Thanks!
Currently, lmdeploy has no problem running mistral-7b. The plan is to add chat template after window attention is supported.
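For anyone landing here from search, a minimal sketch of trying Mistral-7B with lmdeploy's chat CLI. The model ID and the exact subcommand are assumptions based on recent lmdeploy releases (older versions used a different CLI layout), so treat this as illustrative rather than official docs:

```shell
# Sketch: interactive chat with Mistral-7B via lmdeploy's CLI.
# Assumes `pip install lmdeploy` and a GPU with enough memory for fp16 weights.
# The Hugging Face model ID is an assumption; substitute a local path if preferred.
lmdeploy chat mistralai/Mistral-7B-Instruct-v0.1
```

Note that, per the comment above, this runs the model without sliding-window attention, so behavior beyond the native attention span may differ from the reference implementation.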
Okay, thank you! I did not know that, because last time I tried it, it wasn't supported with TurboMind, and it is not on the list of supported models. So right now the implementation is not utilizing the sliding window mechanism at all?
Right. TurboMind hasn't implemented window attention.
Hey, I see that there is now support for fp16 mistral, thank you! Would it be possible to also support int4 mistral as well?
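In case it helps others asking the same thing: int4 in lmdeploy generally means AWQ weight-only quantization. A hedged sketch of that workflow, based on recent lmdeploy documentation (subcommand and flag names may differ in older versions):

```shell
# Sketch: quantize Mistral-7B to int4 with AWQ, then chat with the result.
# Assumes a recent lmdeploy release; the model ID is an assumption.
lmdeploy lite auto_awq mistralai/Mistral-7B-Instruct-v0.1 \
    --work-dir ./mistral-7b-awq

# Load the quantized weights, telling the engine they are in AWQ format.
lmdeploy chat ./mistral-7b-awq --model-format awq
```

Whether this works end-to-end for Mistral depends on the model being on lmdeploy's supported list for quantization, which is exactly what this comment is asking about.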