neural-compressor topic
neural-compressor repositories
optimum-benchmark
240 stars · 43 forks
🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers, with full support for Optimum's hardware optimizations and quantization schemes.
auto-round
222 stars · 19 forks
Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".