YaLM-100B

Provide pruned version for weaker hardware

Open CommanderTvis opened this issue 2 years ago • 2 comments

It would be really useful to have a pruned version of the model (like Balaboba) that can run on weaker GPU setups.

CommanderTvis, Jan 08 '23 12:01

Also, quantization down to 4 bits may be possible, as has been done successfully for LLaMA: https://github.com/ggerganov/llama.cpp
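For anyone unfamiliar with the idea, here is a rough sketch of block-wise 4-bit quantization, plus the back-of-the-envelope memory arithmetic. It is only an illustration under my own assumptions: the block size of 32, the symmetric rounding scheme, and the helper names (`quantize_block`, `dequantize_block`, `weight_memory_gb`) are all hypothetical and are not taken from llama.cpp or from the YaLM code.

```python
import math
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed block size for this sketch, not a YaLM setting


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Map one block of floats to signed 4-bit codes in [-8, 7] plus one scale."""
    max_abs = max(abs(v) for v in values)
    # One shared scale per block; fall back to 1.0 for an all-zero block.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale: float, codes: List[int]) -> List[float]:
    """Recover approximate floats from the 4-bit codes."""
    return [scale * c for c in codes]


def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Toy weights standing in for one block of a real tensor.
weights = [math.sin(i) for i in range(BLOCK_SIZE)]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.4f}, max reconstruction error={max_error:.4f}")

# Weight storage for 100B parameters, ignoring per-block scale overhead.
print(weight_memory_gb(100e9, 16))  # 200.0
print(weight_memory_gb(100e9, 4))   # 50.0
```

Storing one scale per block adds a small overhead on top of the 4 bits per weight, and activations and the attention cache need memory too, so the real footprint would be somewhat higher than the second number.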

CommanderTvis, Mar 20 '23 10:03

+1. Also, this distribution technique may well be applicable here: https://petals.ml

blokhin, Mar 20 '23 16:03