ggml
[Feature request] Implement 8-bit GPT-J
Quantizing to 8-bit brings the weights down to ~11 GB vs. 16 GB in fp16. This is already available in PyTorch via `load_in_8bit=True`:
https://huggingface.co/hivemind/gpt-j-6B-8bit
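For reference, a minimal sketch of the core idea (per-row absmax int8 quantization of a weight matrix) is below. This is only an illustration of the storage savings, not the actual scheme used by the linked checkpoint, which relies on bitsandbytes' dynamic blockwise quantization; the function names here are made up for the example.

```python
import numpy as np

def quantize_int8(w):
    # Per-row absmax scale: map each row's floats into [-127, 127].
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    # Recover approximate fp32 weights; error is bounded by scale / 2.
    return q.astype(np.float32) * scale

np.random.seed(0)
w = np.random.randn(4, 8).astype(np.float32)   # toy fp32 weight matrix
q, s = quantize_int8(w)                        # int8 weights + fp32 scales
w2 = dequantize_int8(q, s)                     # approximate reconstruction
```

Storing `q` (1 byte/weight) plus one scale per row is what gets 6B parameters from ~16 GB of fp16 down to roughly 6 GB of raw weights; the ~11 GB figure for the linked model includes extra state kept in higher precision.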