vllm topic

Repositories tagged vllm

trustgraph

804 Stars · 65 Forks · 804 Watchers

Eliminate hallucinations from your AI agents.

kvcached

682 Stars · 67 Forks · 682 Watchers

Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

FlashTTS

557 Stars · 72 Forks · 557 Watchers

High-quality Chinese speech synthesis and voice cloning, built on models such as SparkTTS and OrpheusTTS.

InferenceMAX

383 Stars · 55 Forks · 383 Watchers

Open Source Continuous Inference Benchmarking - GB200 NVL72 vs MI355X vs B200 vs H200 vs MI325X & soon™ TPUv6e/v7/Trainium2/3/GB300 NVL72 - DeepSeek 670B MoE, GPTOSS

olla

119 Stars · 12 Forks · 119 Watchers

High-performance lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover and unified model discovery across local and remote inference backends.

stopwatch

45 Stars · 4 Forks · 45 Watchers

A tool for benchmarking LLMs on Modal

arks

43 Stars · 5 Forks · 43 Watchers

Arks is a cloud-native inference framework running on Kubernetes

blackbird

39 Stars · 4 Forks · 39 Watchers

A high-performance RDMA distributed file system for fast LLM Inference and GPU Training

deepseek-v3-r1-deploy-and-benchmarks

17 Stars · 3 Forks · 17 Watchers

DeepSeek-V3, R1 671B on 8xH100 Throughput Benchmarks

UltraRAG

2.2k Stars · 195 Forks · 2.2k Watchers

UltraRAG v2: A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines