gptq topic

List gptq repositories

neural-compressor

2.2k
Stars
254
Forks
24
Watchers

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

ialacol

142
Stars
17
Forks

🪶 Lightweight OpenAI drop-in replacement for Kubernetes

LLaMA-Cult-and-More

425
Stars
24
Forks

Large Language Models for All, 🦙 Cult and More, Stay in touch!

xllm

376
Stars
21
Forks

🦖 X—LLM: Cutting Edge & Easy LLM Finetuning

gptq_for_langchain

40
Stars
9
Forks

A guide to using GPTQ models with langchain

llm-api

158
Stars
25
Forks

Run any Large Language Model behind a unified API

zero-lora

30
Stars
3
Forks

zero: zero-training LLM parameter tuning

auto-round

222
Stars
19
Forks

Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"