tensorrt-llm topic

List tensorrt-llm repositories

Awesome-LLM-Inference

1.5k Stars, 118 Forks

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, continuous batching, FlashAttention, PagedAttention, etc.

optimum-benchmark

201 Stars, 33 Forks

A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers, and Sentence-Transformers, with full support for Optimum's hardware optimizations and quantization schemes.

WhisperLive

1.3k Stars, 171 Forks

A nearly-live implementation of OpenAI's Whisper.

cortex

1.7k Stars, 82 Forks, 11 Watchers

Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM). Powers 👋 Jan

WhisperS2T

202 Stars, 19 Forks

An optimized speech-to-text pipeline for the Whisper model, supporting multiple inference engines

openai_trtllm

94 Stars, 16 Forks

OpenAI-compatible API for the TensorRT-LLM Triton backend

End-to-End-LLM

41 Stars, 22 Forks

This repository is AI Bootcamp material that consists of a workflow for LLM