
[Model Request] Mixtral-8x22B-Instruct-v0.1 πŸ™

Open · BuildBackBuehler opened this issue 1 year ago · 5 comments

βš™οΈ Request New Models

  • Link to an existing implementation (e.g. Hugging Face/Github):
  • https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
  • Is this model architecture supported by MLC-LLM? (the list of supported models) Yes

Additional context

This is of course a wildly large model. I doubt my setup can quantize it without hitting its limits, and therein lies the problem! Of course, once it is quantized to q8, q4, etc., it'll be manageable for (maybe) a majority of users (or at least a significant minority haha).

Mixtral 8x7B was wildly popular, so I imagine this'll be a good opportunity to acquaint more people with MLC.

BuildBackBuehler avatar Apr 18 '24 03:04 BuildBackBuehler

And/or WizardLM-8x22B-Instruct πŸ€ͺ

BuildBackBuehler avatar Apr 19 '24 00:04 BuildBackBuehler

I think it should already be supported. cc @vinx13

Hzfengsy avatar Apr 19 '24 05:04 Hzfengsy

It’s supported. Are you requesting a precompiled package?

vinx13 avatar Apr 19 '24 05:04 vinx13

@vinx13 Yep! I'd have already converted/compiled it myself if I could, but 141B parameters and ~300GB of weights are a bit inaccessible for my prosumer setup.

BuildBackBuehler avatar Apr 19 '24 06:04 BuildBackBuehler

Compiling and converting do not require reading all the data into RAM/VRAM. In other words, ~500GB of disk space (original weights + converted weights) is enough; there are no specific RAM or VRAM requirements.
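For reference, a minimal sketch of that disk-only flow using the `mlc_llm convert_weight` / `gen_config` / `compile` subcommands; the local paths, the `q4f16_1` quantization code, and the `mistral_default` conversation template are placeholder choices to adapt to your own setup:

```bash
# Convert the original weights to MLC format, quantizing to 4-bit.
# Weights are processed shard by shard, so only disk space matters here.
mlc_llm convert_weight ./dist/models/Mixtral-8x22B-Instruct-v0.1 \
    --quantization q4f16_1 \
    -o ./dist/Mixtral-8x22B-Instruct-v0.1-q4f16_1-MLC

# Generate the chat config recording the quantization and conversation template.
mlc_llm gen_config ./dist/models/Mixtral-8x22B-Instruct-v0.1 \
    --quantization q4f16_1 \
    --conv-template mistral_default \
    -o ./dist/Mixtral-8x22B-Instruct-v0.1-q4f16_1-MLC

# Compile a model library for the target device.
mlc_llm compile ./dist/Mixtral-8x22B-Instruct-v0.1-q4f16_1-MLC/mlc-chat-config.json \
    --device cuda \
    -o ./dist/libs/Mixtral-8x22B-Instruct-v0.1-q4f16_1-cuda.so
```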

However, at least 80GB of VRAM is required if you'd like to run it :)
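As a sketch, running the converted model afterwards with the `mlc_llm chat` subcommand (again assuming the placeholder paths above):

```bash
# Start an interactive chat session with the locally converted weights;
# --model-lib points at the compiled library from the previous step.
mlc_llm chat ./dist/Mixtral-8x22B-Instruct-v0.1-q4f16_1-MLC \
    --model-lib ./dist/libs/Mixtral-8x22B-Instruct-v0.1-q4f16_1-cuda.so
```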

Hzfengsy avatar Apr 19 '24 06:04 Hzfengsy

Feel free to convert the model yourself.

tqchen avatar May 11 '24 03:05 tqchen