llm-serving topic
ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
mosec
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute hardware.
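The dynamic batching that mosec advertises can be illustrated with a pure-Python sketch (this is the concept only, not mosec's actual API): requests arriving within a short wait window are grouped so the model runs one forward pass per batch instead of per request. The batch size and wait window below are hypothetical tuning knobs.

```python
import queue
import time

MAX_BATCH_SIZE = 4       # hypothetical tuning knobs
MAX_WAIT_SECONDS = 0.01

def collect_batch(q: "queue.Queue[str]") -> list[str]:
    """Block for the first request, then drain more until the batch
    is full or the wait window expires."""
    batch = [q.get()]
    deadline = time.monotonic() + MAX_WAIT_SECONDS
    while len(batch) < MAX_BATCH_SIZE:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch

def run_model(batch: list[str]) -> list[str]:
    # Stand-in for a single batched forward pass.
    return [f"echo:{item}" for item in batch]

if __name__ == "__main__":
    requests = queue.Queue()
    for prompt in ["a", "b", "c"]:
        requests.put(prompt)
    print(run_model(collect_batch(requests)))  # ['echo:a', 'echo:b', 'echo:c']
```

The trade-off is latency versus throughput: a larger wait window yields bigger batches (better GPU utilization) at the cost of higher per-request latency.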
skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
ray-llm
RayLLM - LLMs on Ray
OpenLLM
Run any open-source LLM, such as Llama 3.1 or Gemma, as an OpenAI-compatible API endpoint in the cloud.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
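Several servers in this list (OpenLLM, vllm, ialacol) expose an OpenAI-compatible HTTP API, so a client only needs to point at a different base URL. A minimal request-builder sketch using only the standard library; the base URL and model name are placeholders for whatever your server is configured with:

```python
import json
import urllib.request

# Placeholder endpoint and model name; point these at your own server.
BASE_URL = "http://localhost:8000/v1"
MODEL = "meta-llama/Llama-3.1-8B-Instruct"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style /chat/completions request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_chat_request("Hello!")
    print(req.full_url)  # http://localhost:8000/v1/chat/completions
    # To actually send it, a running server is required:
    # urllib.request.urlopen(req)
```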
sugarcane-ai
npm-like package ecosystem for Prompts 🤖
superduper
Superduper: Integrate AI models and machine learning workflows with your database to implement custom AI applications, without moving your data. Including streaming inference, scalable model hosting,...
ialacol
🪶 Lightweight OpenAI drop-in replacement for Kubernetes
friendli-client
Friendli: the fastest serving engine for generative AI