gptq topic

List gptq repositories

neural-compressor

2.0k
Stars
241
Forks
24
Watchers

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

ialacol

141
Stars
17
Forks

🪶 Lightweight OpenAI drop-in replacement for Kubernetes

LLaMA-Cult-and-More

421
Stars
24
Forks

Large Language Models for All, 🦙 Cult and More, Stay in touch!

xllm

357
Stars
20
Forks

🦖 X—LLM: Cutting Edge & Easy LLM Finetuning

gptq_for_langchain

40
Stars
9
Forks

A guide to using GPTQ models with LangChain

llm-api

147
Stars
22
Forks

Run any Large Language Model behind a unified API

zero-lora

30
Stars
3
Forks

zero: zero-training LLM parameter tuning

auto-round

81
Stars
9
Forks

SOTA weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"