sys_reading
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
https://arxiv.org/pdf/2401.09670v1.pdf