large-scale-deployment topic
large-scale-deployment repositories
Efficiently-Serving-LLMs
Stars: 17 · Forks: 4 · Watchers: 17
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Pred...