vllm topic
Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.
Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
swift
ms-swift: Use PEFT or full-parameter training to fine-tune 250+ LLMs or 25+ MLLMs
llm-vscode-inference-server
An endpoint server for efficiently serving quantized open-source LLMs for code.
OpenRLHF
An easy-to-use, scalable, and high-performance RLHF framework (supports 70B+ full tuning, LoRA, Mixtral, and KTO)
booster
Booster: an open platform for serving LLMs
super-json-mode
Low latency JSON generation using LLMs ⚡️
llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a...
llm-atc
Fine-tuning and serving LLMs on any cloud
TinyLLM
Set up and run a local LLM and chatbot using consumer-grade hardware.