
GPTQ GPT-J support

Open Dampfinchen opened this issue 2 years ago • 2 comments

Hey there.

Please take a look at this code: https://github.com/AlpinDale/gptq-gptj

Could you add 4-bit quantization support for GPT-J? Once implemented, this would allow Pygmalion 6B to be loaded in 4-bit.

Much appreciated.

Important: This code is being tested right now, so it may or may not work. Feel free to provide feedback to AlpinDale.
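For context, here is a rough back-of-the-envelope sketch of why 4-bit matters for a 6B-parameter model. This is an estimate of weight storage only and is not from the linked repo; real VRAM usage also includes activations, the KV cache, and quantization overhead such as scales and zero points:

```python
# Approximate weight memory for a 6B-parameter model (sketch only).
params = 6e9
print(f"fp16 weights:  {params * 2 / 1024**3:.1f} GiB")   # ~11.2 GiB
print(f"4-bit weights: {params * 0.5 / 1024**3:.1f} GiB") # ~2.8 GiB
```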

Dampfinchen avatar Mar 19 '23 13:03 Dampfinchen

The code still needs testing before an attempt at implementation is made. I have not tested it yet - I'm not 100% sure I've got the layer names right. Theoretically it should be fine. Please submit issues (or PRs!) if you find anything broken.
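For anyone who wants to double-check the layer names, here is a minimal sketch (not taken from the linked repo) that builds a single-block copy of the GPT-J architecture and prints the Linear module names a GPTQ implementation would target. The EleutherAI/gpt-j-6B model id and the reduced n_layer are just conveniences to keep it lightweight:

```python
import torch
from transformers import AutoConfig, GPTJForCausalLM

# Build the architecture from its config (random weights, no 24 GB download);
# one transformer block is enough to see the per-layer module names.
config = AutoConfig.from_pretrained("EleutherAI/gpt-j-6B")
config.n_layer = 1

model = GPTJForCausalLM(config)
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear):
        print(name)

# Typically this lists, per block, names like:
#   transformer.h.0.attn.q_proj / k_proj / v_proj / out_proj
#   transformer.h.0.mlp.fc_in / fc_out
# plus lm_head at the top level.
```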

AlpinDale avatar Mar 19 '23 13:03 AlpinDale

Can confirm that the GPTQ implementation for the GPT-J 6B model (and any model fine-tuned off of it, such as Pygmalion 6B) seems to be working perfectly.

AlpinDale avatar Mar 19 '23 15:03 AlpinDale

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.

github-actions[bot] avatar Apr 18 '23 23:04 github-actions[bot]