neural-compressor topic

Repositories tagged with the neural-compressor topic:

optimum-benchmark
201 stars, 33 forks

A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers, and Sentence-Transformers, with full support for Optimum's hardware optimizations and quantization schemes.
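
For a sense of how the library is driven, here is a minimal Python sketch based on the config-object API described in the project's README; the exact class names and arguments (Benchmark, BenchmarkConfig, ProcessConfig, InferenceConfig, PyTorchConfig) are assumptions and may differ across versions.

    # Minimal sketch, assuming the config-object API from the project's README;
    # class names and arguments may differ between versions.
    from optimum_benchmark import (
        Benchmark,
        BenchmarkConfig,
        InferenceConfig,
        ProcessConfig,
        PyTorchConfig,
    )

    if __name__ == "__main__":
        config = BenchmarkConfig(
            name="pytorch_gpt2",
            launcher=ProcessConfig(),  # run the benchmark in an isolated process
            scenario=InferenceConfig(latency=True, memory=True),  # what to measure
            backend=PyTorchConfig(model="gpt2", device="cpu", no_weights=True),
        )
        report = Benchmark.launch(config)  # returns a report with the measurements
        report.log()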

auto-round
81 stars, 9 forks

SOTA weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
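
To illustrate the intended workflow, here is a minimal sketch following the repository's README; treat the AutoRound signature (bits, group_size, sym) and the output path as assumptions that may vary between releases.

    # Minimal sketch, assuming the AutoRound API shown in the repository's
    # README; argument names may vary between releases.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from auto_round import AutoRound

    model_name = "facebook/opt-125m"
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # 4-bit, group size 128, symmetric weight-only quantization
    autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
    autoround.quantize()
    autoround.save_quantized("./opt-125m-autoround")  # hypothetical output path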