AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)

AlpaServe

This repository contains Alpa's multi-model serving system.

This is the official implementation of our OSDI'23 paper: AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.

To reproduce the main results in our paper, follow the instructions in the artifact folder.