gptq topic
gptq repositories
neural-compressor
2.0k
Stars
241
Forks
24
Watchers
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
ialacol
141
Stars
17
Forks
🪶 Lightweight OpenAI drop-in replacement for Kubernetes
LLaMA-Cult-and-More
421
Stars
24
Forks
Large Language Models for All, 🦙 Cult and More, Stay in touch!
xllm
357
Stars
20
Forks
🦖 X—LLM: Cutting Edge & Easy LLM Finetuning
gptq_for_langchain
40
Stars
9
Forks
A guide about how to use GPTQ models with langchain
llm-api
147
Stars
22
Forks
Run any Large Language Model behind a unified API
auto-round
81
Stars
9
Forks
SOTA Weight-only Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"