triton-inference-server topic
serving-codegen-gptj-triton
Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes
Triton-TensorRT-Inference-CRAFT-pytorch
Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a converter from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server -...
torchpipe
Serving inside PyTorch
yolov8-triton
Provides an ensemble model for deploying a YOLOv8 ONNX model to Triton
GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
recsys_pipeline
Build a recommender system with PyTorch + Redis + Elasticsearch + Feast + Triton + Flask: vector recall, DeepFM ranking, and a web application.
tritony
Tiny configuration for Triton Inference Server
Diff-VC
Diffusion Model for Voice Conversion
openai_trtllm
OpenAI-compatible API for the TensorRT-LLM Triton backend
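For context on what an OpenAI-compatible gateway in front of Triton accepts, here is a minimal sketch of building a chat-completions request payload. The endpoint URL and the model name "ensemble" are assumptions for illustration (TensorRT-LLM deployments commonly expose an "ensemble" model), not details taken from the openai_trtllm repo.

```python
import json

# Hypothetical local endpoint of an OpenAI-compatible gateway; adjust to your deployment.
url = "http://localhost:3000/v1/chat/completions"

# Standard OpenAI chat-completions request body.
payload = {
    "model": "ensemble",  # assumed model name; depends on your Triton repository
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 32,
}

body = json.dumps(payload)
print(body)
# To actually send it, POST `body` to `url` with Content-Type: application/json.
```

Any client that speaks the OpenAI API (SDKs, LangChain, curl) can then be pointed at the gateway by overriding its base URL.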
tensorrt-triton-magface
MagFace Triton Inference Server using TensorRT