llm-inference topic
genv
GPU environment and cluster management with LLM support
bespoke_automata
Bespoke Automata is a GUI and deployment pipeline for building complex AI agents locally and offline
inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Exa
Run exascale LLMs on consumer-class GPUs, validated by extensive benchmarks, with no long-term adjustments and a minimal learning curve.
aoororachain
Aoororachain is a Ruby chain tool for working with LLMs
py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
TruthX
Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"
Morpheus
Morpheus - A Network For Powering Smart Agents - Compute + Code + Capital + Community
ray-educational-materials
A suite of hands-on training materials showing how to scale CV, NLP, and time-series forecasting workloads with Ray.
llms-in-prod-workshop-2023
Deploy and Scale LLM-based applications