vllm topic

Repositories tagged with vllm

inference

2.9k Stars · 239 Forks

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any...
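The "single line of code" the description refers to is typically the client's base URL: OpenAI-compatible servers such as Xinference accept the same `/v1/chat/completions` payload as the official API. A minimal sketch, assuming a local server at a placeholder URL (the port and model name below are illustrative, not guaranteed defaults):

```python
import json
from urllib.request import Request

# Assumption: a local Xinference (or other OpenAI-compatible) server.
# Switching providers means changing only this one line.
BASE_URL = "http://localhost:9997/v1"

def chat_request(model: str, prompt: str) -> Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("my-local-llm", "Hello!")
print(req.full_url)
```

The request is built but not sent, so the sketch runs without a live server; in a real app you would pass the same base URL to your OpenAI client library instead.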

BricksLLM

768 Stars · 48 Forks

🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports OpenAI...

ray_vllm_inference

31 Stars · 4 Forks

A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.

fastassert

26 Stars · 0 Forks

Dockerized LLM inference server with constrained output (JSON mode), built on top of vLLM and outlines. Faster, cheaper and without rate limits. Compare the quality and latency to your current LLM API...
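Constrained output means the server restricts generation so the response is guaranteed to parse against a schema, rather than validating after the fact. One way to request this from a vLLM OpenAI-compatible server is its `guided_json` extension field; treat the field name, schema, and model name below as assumptions about the deployment rather than a universal API:

```python
import json

# Hedged sketch: a chat-completion payload asking a vLLM-backed server to
# constrain output to a JSON schema. "guided_json" is a vLLM-specific
# extension; other servers expose constrained decoding differently.
SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "temperature_c": {"type": "number"},
    },
    "required": ["city", "temperature_c"],
}

def guided_payload(model: str, prompt: str) -> dict:
    """Chat-completion payload requesting schema-constrained generation."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "guided_json": SCHEMA,  # vLLM extension field (assumption)
    }

payload = guided_payload("my-model", "Weather in Paris as JSON, please.")
print(json.dumps(payload, indent=2))
```

Because the constraint is enforced during decoding, the client can `json.loads` the response without a retry loop for malformed output.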

DoRA

108 Stars · 2 Forks

Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"
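The technique named in the title can be sketched numerically: decompose a weight matrix into a magnitude vector and a direction matrix, then apply a LoRA-style low-rank update only to the direction. This is a minimal sketch of that idea, not the official implementation; the shapes and the column-wise norm convention are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 16, 2

W0 = rng.normal(size=(d_out, d_in))            # frozen pretrained weight
m = np.linalg.norm(W0, axis=0, keepdims=True)  # magnitude, init to column norms
B = np.zeros((d_out, r))                       # low-rank factors (B starts at zero)
A = rng.normal(size=(r, d_in))

def dora_weight(W0, m, B, A):
    """Merged weight: magnitude times the column-normalized updated direction."""
    V = W0 + B @ A                             # direction with low-rank update
    return m * (V / np.linalg.norm(V, axis=0, keepdims=True))

W = dora_weight(W0, m, B, A)
# With B initialized to zero, the merged weight equals W0 exactly.
print(np.allclose(W, W0))  # True
```

Training would update `m`, `B`, and `A` while `W0` stays frozen, so at initialization the adapted model reproduces the pretrained one.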

llm-inference

43 Stars · 8 Forks

llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, RESTful API, auto-scaling, computing resource...

worker-vllm

162 Stars · 58 Forks

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

happy_vllm

22 Stars · 1 Fork

A production-ready REST API for vLLM

ICE-PIXIU

15 Stars · 0 Forks

ICE-PIXIU: A Cross-Language Financial Large Language Model Framework

vllm-cn

31 Stars · 1 Fork

A demonstration of vLLM's impressive results on Chinese large language models