sys_reading
Splitwise: Efficient Generative LLM Inference Using Phase Splitting
https://arxiv.org/pdf/2311.18677.pdf