inferentia topic
inferentia repositories
vllm
66.6k Stars · 12.3k Forks · 66.6k Watchers
A high-throughput and memory-efficient inference and serving engine for LLMs
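
As a quick illustration of what vllm offers, here is a minimal sketch of its offline batch-inference API in Python; the model name, prompts, and sampling values are illustrative assumptions, not details taken from this listing.

# Minimal sketch of vLLM's offline batch-inference API.
# Model name, prompts, and sampling values are illustrative assumptions.
from vllm import LLM, SamplingParams

prompts = ["Hello, my name is", "The capital of France is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

llm = LLM(model="facebook/opt-125m")               # load a small demo model
outputs = llm.generate(prompts, sampling_params)   # run one batched generation pass

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
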
aphrodite-engine
1.0k Stars · 112 Forks
Large-scale LLM inference engine
guidance-for-machine-learning-inference-on-aws
37 Stars · 9 Forks
This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It addresses the basic implementation requirements as well as ways...
foundation-model-benchmarking-tool
254 Stars · 42 Forks · 254 Watchers
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark its performance across instance types and serving-stack options.