inference-server topic

Repositories tagged inference-server:

truss
884 stars, 63 forks
The simplest way to serve AI/ML models in production

pipeless
662 stars, 31 forks
An open-source computer vision framework to build and deploy apps in minutes

inference
1.3k stars, 113 forks
A fast, easy-to-use, production-ready inference server for computer vision, supporting deployment of many popular model architectures and fine-tuned models.

inference-benchmark
26 stars, 3 forks
Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)
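Online-serving benchmarks of this kind typically measure per-request latency percentiles. A minimal sketch of that idea, timing a stand-in model call with the Python standard library (the `fake_model` function is a placeholder, not code from any repository listed here):

```python
import time

def fake_model(prompt: str) -> str:
    # Stand-in for a real model call (LLM, embedding, etc.).
    time.sleep(0.001)
    return prompt.upper()

def benchmark(fn, requests, warmup=5):
    """Time each request and return (p50, p99) latency in milliseconds."""
    for r in requests[:warmup]:      # warm up caches / lazy initialization
        fn(r)
    latencies = []
    for r in requests:
        start = time.perf_counter()
        fn(r)
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    p50 = latencies[len(latencies) // 2]
    p99 = latencies[min(len(latencies) - 1, int(len(latencies) * 0.99))]
    return p50, p99

p50, p99 = benchmark(fake_model, ["hello"] * 100)
print(f"p50={p50:.2f}ms p99={p99:.2f}ms")
```

Real serving benchmarks add concurrency and throughput measurement on top of this, but the percentile reporting is the same core idea.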

fullstack-machine-learning-inference
30 stars, 12 forks
Fullstack machine learning inference template

onnxruntime-server
96 stars, 5 forks
A server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference.

Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a converter from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server -...

friendli-client
40 stars, 8 forks
Friendli: the fastest serving engine for generative AI

wingman
40 stars, 2 forks
Wingman is the fastest and easiest way to run Llama models on your PC or Mac.

Simple-Inference-Server
24 stars, 1 fork
Inference server implementation from scratch for machine learning models
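A from-scratch inference server like the last entry usually amounts to a small HTTP handler wrapped around a model callable. A minimal sketch using only the Python standard library (the doubling "model" and the /predict route are illustrative assumptions, not taken from that repository):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(inputs):
    # Placeholder "model": double each numeric input.
    return [x * 2 for x in inputs]

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"outputs": predict(payload["inputs"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging for the demo
        pass

# Serve on an ephemeral port in the background, then issue one request.
server = HTTPServer(("127.0.0.1", 0), InferenceHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/predict"
req = urllib.request.Request(url,
                             data=json.dumps({"inputs": [1, 2, 3]}).encode(),
                             headers={"Content-Type": "application/json"})
response = json.loads(urllib.request.urlopen(req).read())
print(response)  # {'outputs': [2, 4, 6]}
server.shutdown()
```

Production servers such as the ones above add batching, model loading, concurrency, and health checks on top of this request/response core.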