AQLM
Request: AQLM quantization of the new Mixtral 8x22B
https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1/tree/main
This is way more powerful than the previous Mixtral on essentially all benchmarks.
I tried quantizing this myself; it seemed to be working, but I started running out of cloud credits :p
Hi, @rationalism. We are planning to quantize this model. Since the new Mixtral is pretty large, this will take some time. Hopefully, the quantized model will be ready in a week or so.
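For anyone who wants to try it once the checkpoint is up, here's a minimal loading sketch using the transformers AQLM integration (`pip install aqlm[gpu] transformers accelerate`). The repo ID below is a hypothetical placeholder until the upload actually happens.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo ID -- replace with the actual one once published.
repo_id = "ISTA-DASLab/Mixtral-8x22B-v0.1-AQLM-2Bit-1x16"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The AQLM quantization config is stored in the checkpoint, so a plain
# from_pretrained call is enough; device_map="auto" spreads the experts
# across whatever GPUs are available.
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
)

inputs = tokenizer("The new Mixtral 8x22B is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```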
This would be amazing! Though I'd prefer the Wizard 8x22B v0.1-Instruct =)
Also, is there any particular reason that Apple (Silicon/Metal/MPS, whatever you want to call it) isn't included as a possible platform? The requirements seem pretty platform-agnostic, so I'll give it a try. Even if Torch gives me problems, I'm curious whether it would run with TVM or another backend. I doubt I could tweak/troubleshoot it much if it comes to that, though.
Edit: I also noticed that inference requires Triton or CUDA, which had been a no-go for Mac, but it turned out to be a small compilation issue that was resolved a few weeks ago. I was able to build/install it successfully, but it's still TBD how much of it actually runs.
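For what it's worth, here's the quick backend check I run before loading anything. This is just standard PyTorch device selection, and whether the AQLM kernels actually dispatch on MPS is still an open question on my end.

```python
import torch

# Pick the best available backend: CUDA if present, then Apple's MPS,
# then plain CPU. AQLM's fused kernels target CUDA/Triton, so anything
# past the first branch is experimental.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")  # Apple Silicon; kernel coverage is TBD
else:
    device = torch.device("cpu")

print(f"Running on: {device}")
```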
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.