vllm topic

Repositories tagged with the vllm topic.

inference

Stars: 8.8k · Forks: 767 · Watchers: 8.8k

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready...
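To illustrate the "single line of code" claim: Xinference exposes an OpenAI-compatible endpoint, so an existing GPT client can be repointed at it by changing only the base URL. A minimal sketch, assuming a locally running Xinference server (the host, port, and model name below are illustrative assumptions):

```python
from openai import OpenAI

# Point the standard OpenAI client at a local Xinference server instead of api.openai.com
# (host, port, and model name are assumptions for a local deployment).
client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="qwen2-instruct",  # whichever model was launched on the server
    messages=[{"role": "user", "content": "Summarize vLLM in one sentence."}],
)
print(response.choices[0].message.content)
```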

BricksLLM

Stars: 880 · Forks: 59

🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports OpenAI...

ray_vllm_inference

Stars: 49 · Forks: 4

A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
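The integration pattern is roughly: wrap a vLLM engine inside a Ray Serve deployment, so Serve handles HTTP routing and replica scaling while vLLM does batched generation. A rough sketch of that pattern, not the repository's actual code (model name and request shape are assumptions):

```python
from ray import serve
from starlette.requests import Request
from vllm import LLM, SamplingParams

@serve.deployment(num_replicas=1, ray_actor_options={"num_gpus": 1})
class VLLMDeployment:
    def __init__(self):
        # Model name is an illustrative assumption.
        self.llm = LLM(model="facebook/opt-125m")
        self.params = SamplingParams(max_tokens=128)

    async def __call__(self, request: Request) -> dict:
        # Expect a JSON body like {"prompt": "..."} (assumed payload shape).
        prompt = (await request.json())["prompt"]
        outputs = self.llm.generate([prompt], self.params)
        return {"text": outputs[0].outputs[0].text}

app = VLLMDeployment.bind()
# serve.run(app)  # exposes the deployment over HTTP on the Ray cluster
```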

fastassert

Stars: 28 · Forks: 0

Dockerized LLM inference server with constrained output (JSON mode), built on top of vLLM and outlines. Faster, cheaper and without rate limits. Compare the quality and latency to your current LLM API...
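The constrained-output idea is that outlines masks the token distribution during decoding so the model can only emit text that matches a schema. A minimal sketch of JSON-constrained generation using outlines' pre-1.0 interface, not fastassert's actual code (model choice and schema are illustrative):

```python
from pydantic import BaseModel
import outlines

class Ticket(BaseModel):
    title: str
    priority: int

# Load a Hugging Face model through outlines (model name is illustrative).
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Build a generator whose output is guaranteed to parse as a Ticket.
generator = outlines.generate.json(model, Ticket)
ticket = generator("Extract a ticket from: 'Login page crashes, urgent.'")
print(ticket)  # a Ticket instance rather than free-form text
```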

DoRA

Stars: 124 · Forks: 4 · Watchers: 124

Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"
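For reference, the decomposition described in the paper: the pretrained weight is split into a magnitude vector and a direction matrix, the low-rank (LoRA) update is applied only to the direction, and the magnitude is trained separately.

```latex
% DoRA: W_0 is the pretrained weight, B A is the low-rank update to the
% direction, m is a learned magnitude vector, and \lVert\cdot\rVert_c
% denotes the column-wise norm.
W' = m \cdot \frac{W_0 + B A}{\lVert W_0 + B A \rVert_c}
```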

llm-inference

Stars: 69 · Forks: 17

llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as UI, RESTful API, auto-scaling, computing resource...

worker-vllm

Stars: 229 · Forks: 87

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
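RunPod serverless workers follow a handler pattern: the SDK invokes a user-supplied function once per queued job, and here that function delegates to a vLLM engine. A rough sketch of the pattern, not the template's actual code (model name and payload fields are assumptions):

```python
import runpod
from vllm import LLM, SamplingParams

# Model name is an illustrative assumption.
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")

def handler(job):
    """Called once per job; job['input'] carries the request payload."""
    payload = job["input"]
    params = SamplingParams(max_tokens=payload.get("max_tokens", 256))
    outputs = llm.generate([payload["prompt"]], params)
    return {"text": outputs[0].outputs[0].text}

# Hand the handler to the RunPod serverless runtime.
runpod.serverless.start({"handler": handler})
```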

happy_vllm

Stars: 22 · Forks: 1

A production-ready REST API for vLLM.

ICE-PIXIU

Stars: 15 · Forks: 1

ICE-PIXIU: A Cross-Language Financial Large Language Model Framework

vllm-cn

Stars: 31 · Forks: 1

Demonstrating the remarkable results of vLLM on Chinese large language models.
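The basic vLLM offline-inference pattern such a demo relies on looks roughly like this, a minimal sketch with an assumed Chinese instruction-tuned model:

```python
from vllm import LLM, SamplingParams

# Model name is an illustrative assumption; any Chinese LLM on the Hub works.
llm = LLM(model="Qwen/Qwen1.5-7B-Chat", trust_remote_code=True)
params = SamplingParams(temperature=0.7, max_tokens=128)

# Prompt: "Introduce vLLM in one sentence."
outputs = llm.generate(["用一句话介绍 vLLM。"], params)
print(outputs[0].outputs[0].text)
```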