fp4 topic

List fp4 repositories

neural-compressor

2.2k stars, 254 forks, 24 watchers

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

neural-speed

346 stars, 38 forks

An innovative library for efficient LLM inference via low-bit quantization