triton-inference-server topic

List triton-inference-server repositories

serving-codegen-gptj-triton (20 stars, 0 forks)

Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes

Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch); includes a PyTorch -> ONNX -> TensorRT converter and inference pipelines (TensorRT, Triton server -...

torchpipe (130 stars, 12 forks)

An alternative to Triton Inference Server. Boosts DL service throughput 1.5-4x via ensemble pipeline serving with concurrent CUDA streams for a PyTorch/LibTorch frontend and TensorRT/CVCUDA, etc. Bac...

yolov8-triton (22 stars, 6 forks)

Provides an ensemble model for deploying a YOLOv8 ONNX model to Triton
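Deploying an ONNX model to Triton revolves around a model repository containing a `config.pbtxt` per model. A minimal sketch of such a config for a YOLOv8-style ONNX model follows; the tensor names and dims here are typical YOLOv8 export shapes, assumed for illustration rather than taken from this repository:

```
name: "yolov8_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "images"
    data_type: TYPE_FP32
    dims: [ 3, 640, 640 ]
  }
]
output [
  {
    name: "output0"
    data_type: TYPE_FP32
    dims: [ 84, 8400 ]
  }
]
```

With `max_batch_size` set, dims exclude the batch dimension; an ensemble model would additionally declare `platform: "ensemble"` and an `ensemble_scheduling` block wiring pre/post-processing steps around this model.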

GenerativeAIExamples (1.6k stars, 263 forks, 36 watchers)

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

recsys_pipeline (33 stars, 7 forks)

Build a recommender system with PyTorch + Redis + Elasticsearch + Feast + Triton + Flask: vector recall, DeepFM ranking, and a web application.

tritony (39 stars, 1 fork)

Tiny configuration for Triton Inference Server

openai_trtllm (94 stars, 16 forks)

OpenAI-compatible API for the TensorRT-LLM Triton backend
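An OpenAI-compatible gateway means standard chat-completions payloads can be sent to the Triton-backed server. A minimal sketch of building such a request body; the model name `"ensemble"` and the default `max_tokens` are illustrative assumptions, not taken from the repository:

```python
import json

def build_chat_request(prompt, model="ensemble", max_tokens=64):
    # Shape follows the OpenAI chat-completions request format; the
    # gateway forwards it to the TensorRT-LLM Triton backend.
    return {
        "model": model,  # assumed model name, depends on your deployment
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = json.dumps(build_chat_request("Hello, Triton!"))
print(body)
```

The serialized body would be POSTed to the gateway's `/v1/chat/completions` endpoint, so existing OpenAI client code can target the local server by overriding its base URL.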

tensorrt-triton-magface (15 stars, 3 forks)

MagFace Triton Inference Server using TensorRT