inference-server topic
truss
The simplest way to serve AI/ML models in production
pipeless
An open-source computer vision framework to build and deploy apps in minutes
inference
A fast, easy-to-use, production-ready inference server for computer vision, supporting deployment of many popular model architectures and fine-tuned models.
inference-benchmark
Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)
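Benchmarks of this kind typically drive a serving endpoint with concurrent requests and report throughput plus latency percentiles. A minimal sketch of that measurement loop follows; the endpoint URL and JSON payload are placeholders for illustration, not taken from this repository:

```python
# Minimal online-serving benchmark sketch: fire N concurrent requests
# at a serving endpoint and report throughput and latency percentiles.
# ENDPOINT and PAYLOAD are illustrative placeholders.
import json
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from urllib import request

ENDPOINT = "http://localhost:8000/v1/predict"  # placeholder URL
PAYLOAD = json.dumps({"prompt": "hello"}).encode()

def one_request() -> float:
    """Send one request and return its latency in seconds."""
    req = request.Request(ENDPOINT, data=PAYLOAD,
                          headers={"Content-Type": "application/json"})
    start = time.perf_counter()
    with request.urlopen(req) as resp:
        resp.read()
    return time.perf_counter() - start

def benchmark(total: int = 100, concurrency: int = 8) -> None:
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda _: one_request(), range(total)))
    wall = time.perf_counter() - wall_start
    print(f"throughput: {total / wall:.1f} req/s")
    print(f"p50: {statistics.median(latencies) * 1e3:.1f} ms")
    print(f"p99: {latencies[int(0.99 * (total - 1))] * 1e3:.1f} ms")

if __name__ == "__main__":
    benchmark()
```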
fullstack-machine-learning-inference
Fullstack machine learning inference template
onnxruntime-server
ONNX Runtime Server: a server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference.
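For context, a server like this wraps ONNX Runtime sessions behind its network APIs. This is what the equivalent single local inference looks like with the onnxruntime Python package; "model.onnx" and the input shape are placeholders for your own exported model:

```python
# Local ONNX Runtime inference: the server exposes this kind of
# session over TCP/HTTP. The model path and input shape below are
# placeholders; substitute your own exported model.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")      # load the graph
input_name = session.get_inputs()[0].name         # first graph input
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: batch})  # None = all outputs
print(outputs[0].shape)
```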
Triton-TensorRT-Inference-CRAFT-pytorch
Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch); includes a converter from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server -...
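The first step in such a conversion chain is a standard torch.onnx.export call. A minimal sketch with a small stand-in module (the real CRAFT detector is not reproduced here):

```python
# Sketch of the PyTorch -> ONNX step of the conversion chain.
# TinyDetector is a stand-in for the real text-detection network;
# the export call itself is the standard torch.onnx.export API.
import torch
import torch.nn as nn

class TinyDetector(nn.Module):
    """Placeholder for the real text-detection network."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 2, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = TinyDetector().eval()
dummy = torch.randn(1, 3, 224, 224)  # example input that fixes shapes
torch.onnx.export(
    model, dummy, "detector.onnx",
    input_names=["image"], output_names=["score_map"],
    dynamic_axes={"image": {0: "batch"}},  # allow variable batch size
)
```

The subsequent ONNX -> TensorRT step is typically done with TensorRT's trtexec tool (e.g. `trtexec --onnx=detector.onnx`) or its Python API; see the repository for the exact pipeline it ships.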
friendli-client
Friendli: the fastest serving engine for generative AI
wingman
Wingman is the fastest and easiest way to run Llama models on your PC or Mac.
Simple-Inference-Server
Inference Server Implementation from Scratch for Machine Learning Models
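For a sense of what "from scratch" means here: a minimal HTTP inference server is little more than a request handler wrapped around a loaded model. A stdlib-only sketch, where the model is a trivial stand-in and the /predict route and JSON schema are illustrative rather than taken from the repository:

```python
# Bare-bones HTTP inference server sketch using only the standard
# library. The "model" is a trivial stand-in; the /predict route and
# JSON schema are illustrative, not from the repo.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def model(features):
    """Stand-in model: sum of the inputs."""
    return {"prediction": sum(features)}

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(model(payload["features"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```

A real server layered on this skeleton would add batching, model loading/versioning, health checks, and concurrency, which is roughly the feature set the production-grade servers in this list provide.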