[Model Request] Mixtral-8x22B-Instruct-v0.1
Request New Models
- Link to an existing implementation (e.g. Hugging Face/Github):
- https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
- Is this model architecture supported by MLC-LLM? (the list of supported models) Yes
Additional context
This is of course a wildly large model. I doubt my setup can quantize it without hitting its limits, and therein lies the problem! Of course, once it is quantized to q8, q4, etc., it'll be manageable for (maybe) a majority of users (or at least a significant minority, haha).
Mixtral 8x7B was wildly popular, so I imagine this'll be a good opportunity to acquaint more people with MLC.
And/or WizardLM-8x22B-Instruct
I think it should already be supported. cc @vinx13
It's supported. Are you requesting a precompiled package?
@vinx13 Yep! I'd have already converted/compiled it myself if I could, but 141B parameters and ~300GB of weights are a bit inaccessible for my prosumer setup.
Compiling and converting do not require reading all of the data into RAM/VRAM. In other words, about 500GB of disk space (original weights + converted weights) is enough; there are no specific RAM or VRAM requirements.
However, at least 80GB of VRAM is required if you'd like to run it :)
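For anyone who wants to try, here is a minimal sketch of the usual convert → gen_config → compile workflow, driven from Python via subprocess (the mlc_llm CLI can of course be called directly instead). The quantization code (q4f16_1), the mistral_default conversation template, the CUDA target, and all paths are illustrative assumptions; check the MLC-LLM docs for the exact flags in your installed version.

```python
import subprocess

MODEL_DIR = "./dist/models/Mixtral-8x22B-Instruct-v0.1"     # local HF checkpoint (assumed path)
OUT_DIR = "./dist/Mixtral-8x22B-Instruct-v0.1-q4f16_1-MLC"  # converted output (assumed path)
QUANT = "q4f16_1"  # assumed 4-bit weight quantization; pick whatever fits your hardware

# 1. Convert the HF weights to the MLC format. Weights are processed shard by
#    shard, so this needs disk space rather than ~300GB of RAM/VRAM.
subprocess.run(
    ["mlc_llm", "convert_weight", MODEL_DIR, "--quantization", QUANT, "-o", OUT_DIR],
    check=True,
)

# 2. Generate mlc-chat-config.json (conversation template, quantization metadata).
subprocess.run(
    ["mlc_llm", "gen_config", MODEL_DIR, "--quantization", QUANT,
     "--conv-template", "mistral_default", "-o", OUT_DIR],
    check=True,
)

# 3. Compile the model library for the target device (assumed CUDA here).
subprocess.run(
    ["mlc_llm", "compile", f"{OUT_DIR}/mlc-chat-config.json",
     "--device", "cuda", "-o", f"{OUT_DIR}/mixtral-8x22b-cuda.so"],
    check=True,
)
```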
Feel free to convert the model yourself.
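Once converted, running it locally could look roughly like the snippet below, using the OpenAI-style MLCEngine API from the mlc_llm Python package. The model path matches the illustrative output directory in the sketch above; by default the engine JIT-compiles the model library, or a prebuilt one can be passed instead. This is a sketch rather than a recipe, and it assumes a machine with roughly 80GB of VRAM as noted above.

```python
from mlc_llm import MLCEngine

# Illustrative path: the converted output directory from the sketch above.
model = "./dist/Mixtral-8x22B-Instruct-v0.1-q4f16_1-MLC"

engine = MLCEngine(model)

# OpenAI-style chat completion, streamed token by token.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Hello from Mixtral-8x22B on MLC!"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content or "", end="", flush=True)
print()

engine.terminate()
```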