llm-serving topic

A list of repositories under the llm-serving topic.

ray

31.6k Stars · 5.3k Forks · 450 Watchers

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
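
For a sense of what Ray's core runtime provides, here is a minimal sketch of its remote-task API; the function and values are illustrative only:

```python
# Minimal Ray sketch: functions decorated with @ray.remote run as parallel
# tasks on the cluster (or on a local runtime started by ray.init()).
import ray

ray.init()  # connects to an existing cluster, or starts a local one

@ray.remote
def square(x):
    return x * x

# Launch tasks in parallel and gather the results.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```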

mosec

712 Stars · 48 Forks

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute resources.
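
A rough sketch of a mosec service, assuming its `Server`/`Worker` API behaves as documented; the `Echo` worker, port, and batch size are placeholders for a real model:

```python
# Sketch of a mosec service: a Worker's forward() handles requests, and the
# server exposes an HTTP endpoint (port 8000 by default).
from mosec import Server, Worker


class Echo(Worker):
    def forward(self, data):
        # With max_batch_size > 1, `data` is a list of decoded request bodies
        # gathered by dynamic batching; return a list of the same length.
        return [{"echo": d} for d in data]


if __name__ == "__main__":
    server = Server()
    server.append_worker(Echo, num=1, max_batch_size=8)
    server.run()
```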

skypilot

5.8k Stars · 404 Forks · 27 Watchers

SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
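
A rough sketch of launching a job through SkyPilot's Python API, assuming `sky.Task`, `sky.Resources`, and `sky.launch` behave as in the project's quickstart; the accelerator spec, setup command, and cluster name are placeholders:

```python
# Sketch: define a task with setup/run commands, request a GPU, and let
# SkyPilot provision a matching VM on whichever configured cloud can serve it.
import sky

task = sky.Task(
    setup="pip install vllm",  # runs once when the VM is provisioned
    run="python -c 'print(\"hello from the cloud\")'",
)
task.set_resources(sky.Resources(accelerators="A100:1"))

sky.launch(task, cluster_name="llm-serve")
```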

OpenLLM

9.0k Stars · 574 Forks · 49 Watchers

Run open-source LLMs, such as Llama 2 and Mistral, as OpenAI-compatible API endpoints in the cloud.
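
Because the served endpoint is OpenAI-compatible, the standard `openai` Python client can talk to it; the port and model name below are assumptions, not guaranteed OpenLLM defaults:

```python
# Sketch: query a locally running OpenAI-compatible server with the openai client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")

resp = client.chat.completions.create(
    model="meta-llama/Llama-2-7b-chat-hf",  # placeholder model id
    messages=[{"role": "user", "content": "What does an LLM serving engine do?"}],
)
print(resp.choices[0].message.content)
```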

vllm

20.3k Stars · 2.7k Forks · 190 Watchers

A high-throughput and memory-efficient inference and serving engine for LLMs
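
A minimal offline-inference sketch with vLLM's Python API; the model name is just a small placeholder for smoke-testing:

```python
# Sketch: load a model with vLLM and generate completions in batch.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model, easy to smoke-test
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The key idea behind efficient LLM serving is"], params)
for out in outputs:
    print(out.outputs[0].text)
```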

sugarcane-ai

46 Stars · 15 Forks

An npm-like package ecosystem for prompts 🤖

superduperdb

4.4k Stars · 434 Forks

🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalab...

ialacol

141 Stars · 17 Forks

🪶 Lightweight OpenAI drop-in replacement for Kubernetes

friendli-client

36 Stars · 5 Forks

Friendli: the fastest serving engine for generative AI