tensorrt-llm topic

List tensorrt-llm repositories

Awesome-LLM-Inference

2.6k Stars, 175 Forks

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

optimum-benchmark

240 Stars, 43 Forks

🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support for Optimum's hardware optimizations and quantization schemes.

WhisperLive

1.9k Stars, 261 Forks

A nearly-live implementation of OpenAI's Whisper.

cortex.cpp

1.9k Stars, 108 Forks

Run and customize local LLMs.

WhisperS2T

285 Stars, 28 Forks

An optimized speech-to-text pipeline for the Whisper model, supporting multiple inference engines.

openai_trtllm

152 Stars, 25 Forks

OpenAI-compatible API for the TensorRT-LLM Triton backend.

End-to-End-LLM

58 Stars, 28 Forks

This repository contains AI Bootcamp material consisting of a workflow for LLMs.

lm-fly

16 Stars, 4 Forks

Inference framework acceleration for large models: make LLMs fly.

grps

147 Stars, 13 Forks

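[Deep learning model deployment framework] Supports tf/torch/trt/trtllm/vllm and other NN frameworks; supports dynamic batching and streaming modes; supports both Python and C++; restrictable, extensible, and high-performance. Helps users quickly deploy models to production and, via HTTP/RPC ...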