inferentia topic

vllm

38.4k stars, 5.8k forks, 320 watchers

A high-throughput and memory-efficient inference and serving engine for LLMs

aphrodite-engine

1.0k stars, 112 forks

Large-scale LLM inference engine

This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It addresses the basic implementation requirements as well as ways...

foundation-model-benchmarking-tool

176 stars, 26 forks

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.