[Model Request] Mixtral-8x22B-Instruct-v0.1
Request New Models
- Link to an existing implementation (e.g. Hugging Face/Github):
- https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
- Is this model architecture supported by MLC-LLM? (the list of supported models) Yes
Additional context
This is of course a wildly large model. I doubt my setup can quantize it without hitting its limits, and therein lies the problem! Of course, once it is quantized to q8, q4, etc., it'll be manageable for (maybe) a majority of users (or at least a significant minority, haha).
Mixtral 8x7B was wildly popular, so I imagine this'll be a good opportunity to acquaint more people with MLC.
And/or WizardLM-8x22B-Instruct
I think it should already be supported. cc @vinx13
It's supported. Are you requesting a precompiled package?
@vinx13 Yep! I'd have already converted/compiled it myself if I could, but 141B parameters and ~300GB of weights are a bit inaccessible for my prosumer setup.
Compiling and converting do not require reading all of the data into RAM/VRAM. In other words, about 500GB of disk space (original weights + converted weights) is enough; there are no specific RAM or VRAM requirements.
However, at least 80GB of VRAM is required if you'd like to run it :)
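For anyone who wants to try, here is a minimal sketch of the usual convert → gen_config → compile workflow, driven from Python via subprocess (the mlc_llm CLI can of course be called directly instead). The quantization code (q4f16_1), the mistral_default conversation template, the CUDA target, and all paths are illustrative assumptions; check the MLC-LLM docs for the exact flags in your installed version.

```python
import subprocess

MODEL_DIR = "./dist/models/Mixtral-8x22B-Instruct-v0.1"     # local HF checkpoint (assumed path)
OUT_DIR = "./dist/Mixtral-8x22B-Instruct-v0.1-q4f16_1-MLC"  # converted output (assumed path)
QUANT = "q4f16_1"  # assumed 4-bit weight quantization; pick whatever fits your hardware

# 1. Convert the HF weights to the MLC format. Weights are processed shard by
#    shard, so this needs disk space rather than ~300GB of RAM/VRAM.
subprocess.run(
    ["mlc_llm", "convert_weight", MODEL_DIR, "--quantization", QUANT, "-o", OUT_DIR],
    check=True,
)

# 2. Generate mlc-chat-config.json (conversation template, quantization metadata).
subprocess.run(
    ["mlc_llm", "gen_config", MODEL_DIR, "--quantization", QUANT,
     "--conv-template", "mistral_default", "-o", OUT_DIR],
    check=True,
)

# 3. Compile the model library for the target device (assumed CUDA here).
subprocess.run(
    ["mlc_llm", "compile", f"{OUT_DIR}/mlc-chat-config.json",
     "--device", "cuda", "-o", f"{OUT_DIR}/mixtral-8x22b-cuda.so"],
    check=True,
)
```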
Feel free to convert the model yourself.
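Once converted, running it locally could look roughly like the snippet below, using the OpenAI-style MLCEngine API from the mlc_llm Python package. The model path matches the illustrative output directory in the sketch above; by default the engine JIT-compiles the model library, or a prebuilt one can be passed instead. This is a sketch rather than a recipe, and it assumes a machine with roughly 80GB of VRAM as noted above.

```python
from mlc_llm import MLCEngine

# Illustrative path: the converted output directory from the sketch above.
model = "./dist/Mixtral-8x22B-Instruct-v0.1-q4f16_1-MLC"

engine = MLCEngine(model)

# OpenAI-style chat completion, streamed token by token.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Hello from Mixtral-8x22B on MLC!"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content or "", end="", flush=True)
print()

engine.terminate()
```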