Repositories matching the inferentia topic
vllm (38.4k stars, 5.8k forks, 320 watchers)
A high-throughput and memory-efficient inference and serving engine for LLMs
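
As a brief illustration of how vLLM is typically driven from Python, here is a minimal offline-inference sketch using its documented LLM/SamplingParams API. The model ID and prompts are placeholders, not details from this listing; on Inferentia hosts vLLM would be installed with AWS Neuron support, but the calling code looks the same.

```python
# Minimal vLLM offline-inference sketch (model ID and prompts are
# illustrative placeholders).
from vllm import LLM, SamplingParams

prompts = [
    "Explain AWS Inferentia in one sentence.",
    "What is continuous batching?",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

llm = LLM(model="facebook/opt-125m")  # any supported Hugging Face model ID
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt)
    print(output.outputs[0].text)
```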
aphrodite-engine (1.0k stars, 112 forks)
Large-scale LLM inference engine
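
Aphrodite Engine serves models over an OpenAI-compatible HTTP endpoint, so a running instance can be queried with the stock openai Python client. The sketch below assumes a server is already running locally; the base URL, port, model name, and prompt are assumptions for illustration, not details from this listing.

```python
# Query a locally running Aphrodite Engine instance through its
# OpenAI-compatible endpoint (base URL, port, and model name are assumptions).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:2242/v1",   # assumed local server address
    api_key="not-needed-for-local-use",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model the server loaded
    messages=[{"role": "user", "content": "Summarize what an inference engine does."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```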
guidance-for-machine-learning-inference-on-aws (37 stars, 9 forks)
This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It addresses the basic implementation requirements as well as ways...
foundation-model-benchmarking-tool (176 stars, 26 forks)
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark performance across instance types and serving stack options.