vllm topic
LightCompress
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models, including LLMs, VLMs, and video generation models.
vidur
A large-scale simulation framework for LLM inference
harbor
Effortlessly run LLM backends, APIs, frontends, and services with one command.
prometheus-eval
Evaluate your LLM's response with Prometheus and GPT4 💯
ramalama
The goal of ramalama is to make working with AI boring.
llmaz
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
nextjs-vllm-ui
Fully-featured, beautiful web interface for vLLM - built with NextJS.
grps
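[Deep learning model deployment framework] Supports tf/torch/trt/trtllm/vllm and other NN frameworks, dynamic batching and streaming modes, and both Python and C++; supports limits, is extensible, and is high-performance. Helps users quickly deploy models online and, via http/rpc...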