BitNet
Distributed computing?
I have multiple machines with many CPUs, but token generation on each machine is slow (2 t/s). Is there any way to deploy BitNet in a distributed fashion so that I can use all the idle CPUs to improve the token generation speed?
ref: https://www.reddit.com/r/LocalLLaMA/comments/1cyzi9e/llamacpp_now_supports_distributed_inference/
What type of machine are you using? You may need to figure out why it is so slow; ideally it should be over 20 t/s on a recently released machine.
Actually, I am running it on a Raspberry Pi 4B. Maybe four of these machines together can reach 20 t/s.
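As a rough sanity check on those numbers, here is a back-of-the-envelope sketch in Python. It assumes perfectly linear scaling with node count, which is an optimistic upper bound rather than something distributed inference over a network is known to deliver. The function names are made up for illustration and are not part of BitNet or llama.cpp.

```python
import math


def ideal_throughput(per_node_tps: float, nodes: int) -> float:
    """Upper bound on tokens/s, ASSUMING throughput scales linearly with nodes."""
    return per_node_tps * nodes


def nodes_needed(per_node_tps: float, target_tps: float) -> int:
    """Minimum node count to hit target_tps under the same linear-scaling assumption."""
    return math.ceil(target_tps / per_node_tps)


# Figures taken from the thread: 2 t/s per Raspberry Pi 4B, target of 20 t/s.
best_case_four_nodes = ideal_throughput(2.0, 4)
min_nodes_for_target = nodes_needed(2.0, 20.0)

print(f"4 nodes, ideal scaling: {best_case_four_nodes} t/s")
print(f"nodes needed for 20 t/s: {min_nodes_for_target}")
```

Even under this best-case assumption, four machines at 2 t/s each top out at 8 t/s, so reaching 20 t/s would take at least ten of them, before accounting for any network or synchronization overhead.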