serving-infrastructure topic

List serving-infrastructure repositories

Efficiently-Serving-LLMs

Stars: 17 | Forks: 4 | Watchers: 17

Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Pred...