2-bit 405B?
Would be cool.
Hi, @ewof!
Thank you for your suggestion. There are several technical difficulties in making it fit onto the GPUs for quantization, but it is definitely possible. We are already working on this. Unfortunately, we are a bit short on manpower, so I'm not sure when, or if, this will happen.
Hi @ewof and @Vahe1994,
No offense intended. AQLM is a fantastic project, and VPTQ has acknowledged your work in its acknowledgments.
I've successfully reproduced the VPTQ method and released several models on Hugging Face, including the 405B LLaMA 3.1, 70B LLaMA 3.1, and 72B LLaMA 3.2.
I welcome discussion and testing — let's explore these models together!
Hey @OpenSourceRonin, thank you for letting us know. We are all for open source and for making models available to people, regardless of which quantization method was used. So, of course, no offense taken. You did a great job! Thank you for your work.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.