fp4 topic

2.2k stars, 254 forks

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

350 stars, 350 watchers

An innovative library for efficient LLM inference via low-bit quantization