YaLM-100B
Provide pruned version for weaker hardware
It would be really useful to have a pruned version of the model (like Balaboba) that can run on weaker GPU setups.
Also, quantization even down to 4 bits may be possible, as has been done successfully for LLaMA: https://github.com/ggerganov/llama.cpp
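For reference, here is a minimal sketch of what 4-bit round-to-nearest quantization looks like, using a single per-tensor scale in NumPy. This is a simplification for illustration only: real implementations such as llama.cpp use per-block scales and pack two 4-bit codes per byte, which this sketch does not do.

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Naive symmetric 4-bit quantization: map floats to integers in [-8, 7]
    using one scale for the whole tensor."""
    scale = np.max(np.abs(weights)) / 7.0  # largest magnitude maps to +/-7
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the 4-bit codes."""
    return q.astype(np.float32) * scale

# Round-trip a small random weight matrix and measure reconstruction error.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)
err = np.max(np.abs(w - w_hat))  # bounded by half a quantization step
```

With round-to-nearest and a per-tensor scale, the worst-case per-weight error is half a step (`scale / 2`); per-block scales shrink that error further, which is why llama.cpp's formats quantize in small blocks.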
+1, and this distributed-inference approach might also be very applicable here: https://petals.ml