GPTQ GPT-J support
Hey there.
Please take a look at this code: https://github.com/AlpinDale/gptq-gptj
Could you add 4-bit quantization support for GPT-J? Once implemented, this would allow Pygmalion 6B to load in 4-bit.
Much appreciated.
Important: This code is being tested right now, so it may or may not work. Feel free to provide feedback to AlpinDale.
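For context, here's a rough sketch of what the 4-bit quantization step could look like. It uses the AutoGPTQ library rather than the linked repo's own scripts, so treat it as an assumption about the workflow, not the repo's actual invocation; the model ID and output directory are placeholders.

```python
# Rough sketch, NOT the linked repo's code: 4-bit GPTQ quantization of a
# GPT-J fine-tune via the AutoGPTQ library. Model ID / paths are placeholders.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained = "PygmalionAI/pygmalion-6b"   # any GPT-J-based model
quantized_dir = "pygmalion-6b-4bit-128g"  # where the 4-bit weights will go

tokenizer = AutoTokenizer.from_pretrained(pretrained)
# GPTQ calibrates its quantization error on a small set of sample texts;
# a real run would use more (and more representative) examples than this.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
model = AutoGPTQForCausalLM.from_pretrained(pretrained, quantize_config)
model.quantize(examples)
model.save_quantized(quantized_dir)
tokenizer.save_pretrained(quantized_dir)
```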
The code still needs testing before an implementation attempt is made. I haven't tested it yet, and I'm not 100% sure I've got the layer names right. In theory it should be fine. Please submit issues (or PRs!) if you find anything broken.
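If it helps with checking the layer names, here's a minimal sketch (mine, not from the repo) that builds GPT-J from its config and prints every `nn.Linear` module, so the names can be compared against the ones the GPTQ code expects:

```python
# Minimal sketch (not from the repo): print every nn.Linear in GPT-J so the
# names can be checked against the ones the GPTQ code quantizes.
import torch.nn as nn
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("EleutherAI/gpt-j-6B")
config.n_layer = 2  # shrink for a quick check; names repeat per layer
model = AutoModelForCausalLM.from_config(config)  # random init, no weights

for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        print(name)  # e.g. transformer.h.0.attn.q_proj, ...mlp.fc_in, lm_head
```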
Can confirm that the GPTQ implementation for the GPT-J 6B model (and any model fine-tuned off of it, such as Pygmalion 6B) seems to be working perfectly.
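For anyone who wants to try it, a minimal loading sketch. Again, this assumes AutoGPTQ-format weights and a placeholder directory, not the repo's own loader:

```python
# Hedged sketch: load and run a 4-bit GPTQ GPT-J / Pygmalion checkpoint.
# Assumes AutoGPTQ-format weights; the directory is a placeholder.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

quantized_dir = "pygmalion-6b-4bit-128g"  # placeholder path

tokenizer = AutoTokenizer.from_pretrained(quantized_dir)
model = AutoGPTQForCausalLM.from_quantized(quantized_dir, device="cuda:0")

inputs = tokenizer("Hello there!", return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```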
This issue has been closed after 30 days of inactivity. If you believe it is still relevant, please leave a comment below.