Consider a case where a pre-trained model is hosted on only three servers: the first hosts blocks 1-4, the second hosts blocks 2-64, and the third hosts blocks 32-128....
Hi @borzunov, I started a private swarm with two GPU servers and built this web chat on another CPU node. When I input something, it raises errors like this `Oct...
Hi @MrYxJ, thanks for open-sourcing this excellent repo. May I ask what I should do if I only want to compute the FLOPs of a transformer layer in a...