tensorrt-llm topic
Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
optimum-benchmark
A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.
WhisperLive
A nearly-live implementation of OpenAI's Whisper.
cortex
Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM). Powers 👋 Jan
WhisperS2T
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
openai_trtllm
OpenAI-compatible API for the TensorRT-LLM Triton backend
End-to-End-LLM
This repository contains AI Bootcamp material that consists of a workflow for LLMs
Chat-With-RTX-python-api
Chat With RTX Python API