
Request Support for Mixtral-8x22B

Open rankaiyx opened this issue 10 months ago • 15 comments

Feature Description

Support for Mixtral-8x22B

Mistral AI has just released another large model, Mixtral 8x22B, again via a magnet link. The model files total 281.24 GB.

Judging by its name, Mixtral 8x22B is a scaled-up version of "mixtral-8x7b", which was released last year. The nominal parameter count has more than tripled: it is made up of eight expert networks of 22 billion parameters each (8 x 22B).

magnet:?xt=urn:btih:9238b09245d0d8cd915be09927769d5f7584c1c9&dn=mixtral-8x22b&tr=udp%3A%2F%2Fopen.demonii.com%3A1337%2Fannounce&tr=http%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
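
As a rough sanity check on the size figures, the download size can be turned into an approximate parameter count. This is only a sketch: it assumes the weights are stored at 2 bytes per parameter and that "GB" means 10^9 bytes, neither of which is stated here. Under those assumptions the total comes out below a literal 8 x 22B, which would be consistent with the experts sharing the non-expert layers rather than being eight fully separate models.

```python
# Back-of-the-envelope parameter count from the download size.
# Assumptions (not stated in the thread): 16-bit weights, i.e. 2 bytes per
# parameter, and "GB" meaning 10**9 bytes.
FILE_SIZE_GB = 281.24   # figure quoted in the issue text
BYTES_PER_PARAM = 2     # assumed bf16/fp16 storage


def params_in_billions(size_gb: float, bytes_per_param: int) -> float:
    """Estimate the parameter count (in billions) from a checkpoint size."""
    total_bytes = size_gb * 1e9
    return total_bytes / bytes_per_param / 1e9


estimated = params_in_billions(FILE_SIZE_GB, BYTES_PER_PARAM)
naive = 8 * 22          # reading "8x22B" literally, in billions

print(f"estimated from file size: {estimated:.2f}B")
print(f"literal 8 x 22B:          {naive}B")
```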

Motivation

It should be a good model.

rankaiyx avatar Apr 10 '24 05:04 rankaiyx

+1

LiuChaoXD avatar Apr 10 '24 05:04 LiuChaoXD

It is not Mistral Medium; it's a new model. Mistral Medium has a different context length, etc., and Mistral Medium was leaked earlier. They said it's a brand-new model.

anunknowperson avatar Apr 10 '24 10:04 anunknowperson

Did someone download the torrent? Is it an HF model with modeling code, or only weights without the architecture?

phymbert avatar Apr 10 '24 10:04 phymbert

> It is not Mistral Medium; it's a new model. Mistral Medium has a different context length, etc., and Mistral Medium was leaked earlier. They said it's a brand-new model.

Okay, I'll change the title.

rankaiyx avatar Apr 10 '24 10:04 rankaiyx

@phymbert

Don't know if this is useful, but it's already up on Hugging Face: https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1

(You'll find many uploads).

simsi-andy avatar Apr 10 '24 10:04 simsi-andy

> Don't know if this is useful, but it's already up on Hugging Face: https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1

It is useful, thanks. I did not notice they changed the org. Let's go then.

phymbert avatar Apr 10 '24 11:04 phymbert

It just works. =D

https://huggingface.co/MaziyarPanahi/Mixtral-8x22B-v0.1-GGUF/tree/main

simsi-andy avatar Apr 10 '24 14:04 simsi-andy

Confirmed the IQ3_XS runs without changes.

digiwombat avatar Apr 10 '24 14:04 digiwombat

Is it really the exact same architecture though? Perhaps there are some subtle optimizations.

Dampfinchen avatar Apr 10 '24 16:04 Dampfinchen

It looks that way, just bigger. Compare the two configs:
https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1/blob/main/config.json
https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1/blob/main/config.json

phymbert avatar Apr 10 '24 17:04 phymbert
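
The comparison of the two config.json files can be sketched as a key-by-key diff. This is a minimal illustration, not anything from llama.cpp: `diff_configs` and `load_config` are hypothetical helper names, the file paths are whatever you saved the configs as, and the two inline dictionaries are toy stand-ins rather than the real config values.

```python
import json
from pathlib import Path


def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs.

    A key missing on one side is reported with None on that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


def load_config(path: str) -> dict:
    """Load a locally downloaded config.json (the path is up to the caller)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Toy stand-ins, NOT the real config contents; use load_config() on the
# downloaded files to compare the actual models.
small = {"model_type": "mixtral", "num_local_experts": 8, "hidden_size": 1}
large = {"model_type": "mixtral", "num_local_experts": 8, "hidden_size": 2}

changed = diff_configs(small, large)
print(changed)
```

If only size-related keys show up in the diff and the architecture-related keys match, that supports the "same architecture, just bigger" reading.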

Unfortunately, convert fails with Mixtral 8x22b instruct:

ValueError: Vocab size mismatch (model has 32768, but Mixtral-8x22B-Instruct-v0.1/tokenizer.json has 32769).

This off-by-little (sometimes 1, sometimes a few more) is actually a very common problem with older models that I quantize, but because they are older, I haven't bothered reporting it yet.

schmorp avatar Apr 18 '24 06:04 schmorp
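
One way such an off-by-one can arise is an added token whose text does not match any entry in the base vocabulary, so it gets counted as an extra token. The sketch below illustrates that counting on a toy tokenizer. It assumes the Hugging Face tokenizers JSON layout (`model.vocab` as a token-to-id mapping, `added_tokens` as a list of objects with a `content` field), and it is not the converter's actual logic; both function names are hypothetical.

```python
def unmatched_added_tokens(tokenizer: dict) -> list:
    """List added tokens whose text is absent from the base vocabulary."""
    vocab = tokenizer["model"]["vocab"]
    added = {tok["content"] for tok in tokenizer.get("added_tokens", [])}
    return sorted(added - set(vocab))


def counted_vocab_size(tokenizer: dict) -> int:
    """Base vocabulary size plus added tokens the base vocabulary lacks."""
    return len(tokenizer["model"]["vocab"]) + len(unmatched_added_tokens(tokenizer))


# Toy tokenizer, not real data: "<extra>" reuses id 2 under a different
# spelling, so it is counted on top of the three base entries.
toy = {
    "model": {"vocab": {"<s>": 0, "</s>": 1, "hello": 2}},
    "added_tokens": [
        {"content": "<s>", "id": 0},
        {"content": "<extra>", "id": 2},
    ],
}

print(unmatched_added_tokens(toy))
print(counted_vocab_size(toy))
```

Running `unmatched_added_tokens` on a real tokenizer.json (loaded with `json.load`) would show which entries, if any, account for the surplus.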

#6740

stefanvarunix avatar Apr 19 '24 09:04 stefanvarunix

> Unfortunately, convert fails with Mixtral 8x22b instruct:
>
> ValueError: Vocab size mismatch (model has 32768, but Mixtral-8x22B-Instruct-v0.1/tokenizer.json has 32769).
>
> This off-by-little (sometimes 1, sometimes a few more) is actually a very common problem with older models that I quantize, but because they are older, I haven't bothered reporting it yet.

That is because of a bug in the original Mistral AI upload. Open the file tokenizer.json and change "TOOL_RESULT" to "TOOL_RESULTS", and the conversion should work.

https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1/discussions/6

tholin avatar Apr 19 '24 12:04 tholin
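
The manual edit described above can also be scripted. This is a minimal sketch under stated assumptions: it does a plain text replacement over the file, skips occurrences already spelled "TOOL_RESULTS", and writes a backup first. The helper names and the `.bak` backup suffix are hypothetical choices, not part of any llama.cpp tooling.

```python
import re
from pathlib import Path
from typing import Tuple

# Match TOOL_RESULT only when it is not already followed by "S", so an
# existing TOOL_RESULTS entry is left untouched.
_PATTERN = re.compile(r"TOOL_RESULT(?!S)")


def patch_tokenizer_text(text: str) -> Tuple[str, int]:
    """Return the patched text and the number of replacements made."""
    return _PATTERN.subn("TOOL_RESULTS", text)


def patch_tokenizer_file(path: str) -> int:
    """Patch a tokenizer.json in place, keeping a .bak copy of the original."""
    source = Path(path)
    original = source.read_text(encoding="utf-8")
    patched, count = patch_tokenizer_text(original)
    if count:
        Path(str(source) + ".bak").write_text(original, encoding="utf-8")
        source.write_text(patched, encoding="utf-8")
    return count


# Toy input, not the real file contents.
sample = '{"content": "[TOOL_RESULT]"}, {"[TOOL_RESULTS]": 9}'
patched_sample, replaced = patch_tokenizer_text(sample)
print(replaced)
print(patched_sample)
```

Running the patch twice is harmless, since the second pass finds nothing left to replace.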

@tholin: indeed, thanks a lot!

schmorp avatar Apr 20 '24 02:04 schmorp

@tholin: while convert.py succeeds, it results in an 11 GB output file, so something still doesn't work. (b2699)

Update: no longer happens with b2715

schmorp avatar Apr 20 '24 05:04 schmorp

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Jun 07 '24 01:06 github-actions[bot]