AQLM
Request: AQLM quantization of the new Mixtral 8x22B
https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1/tree/main
This is way more powerful than the previous Mixtral on essentially all benchmarks.
I tried quantizing this myself; it seemed to be working, but I started running out of cloud credits :p
Hi, @rationalism. We are planning to quantize this model. Since the new Mixtral is pretty large, this will take some time. Hopefully, the quantized model will be ready in a week or so.
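For anyone who wants to try it once the checkpoint is up, here's a minimal loading sketch using the transformers AQLM integration (`pip install aqlm[gpu] transformers accelerate`). The repo ID below is a hypothetical placeholder until the upload actually happens.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo ID -- replace with the actual one once published.
repo_id = "ISTA-DASLab/Mixtral-8x22B-v0.1-AQLM-2Bit-1x16"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The AQLM quantization config is stored in the checkpoint, so a plain
# from_pretrained call is enough; device_map="auto" spreads the experts
# across whatever GPUs are available.
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
)

inputs = tokenizer("The new Mixtral 8x22B is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```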
This would be amazing! Though I'd prefer the Wizard 8x22B v0.1-Instruct =)
Also, is there any particular reason that Apple (Silicon/Metal/MPS, whatever you want to call it) isn't included as a possible platform? The requirements seem pretty platform-agnostic, so I'll give it a try. Even if Torch gives me problems, I'm curious whether it would run with TVM or another backend. I doubt I could tweak/troubleshoot it much if it comes to that, though.
Edit: I also noticed that inference requires Triton or CUDA, which had been a no-go for Mac, but it turned out to be a small compilation issue that was resolved a few weeks ago. I was able to build/install it successfully, but it's still TBD how much of it actually runs.
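For what it's worth, here's the quick backend check I run before loading anything. This is just standard PyTorch device selection, and whether the AQLM kernels actually dispatch on MPS is still an open question on my end.

```python
import torch

# Pick the best available backend: CUDA if present, then Apple's MPS,
# then plain CPU. AQLM's fused kernels target CUDA/Triton, so anything
# past the first branch is experimental.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")  # Apple Silicon; kernel coverage is TBD
else:
    device = torch.device("cpu")

print(f"Running on: {device}")
```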
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.