sys_reading
Fast Distributed Inference Serving for Large Language Models
https://arxiv.org/pdf/2305.05920.pdf
https://zhuanlan.zhihu.com/p/648759542