neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Results 155 neural-compressor issues

### Remove 1.x related code
Some folders and files include both 1.x and 2.x UTs. I've removed the 1.x UTs if they import something from `experimental` or `conf`. Please help...

## Type of Change
feature
## Description
- [x] implement the `incbench` command as an entry point for easy-to-use benchmarking
- [x] automatically check NUMA/socket info and dump it as a table for ease-of-understand...
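
As a rough illustration of the second checklist item, the sketch below parses `lscpu`-style text and prints a small topology table. It is not the `incbench` implementation: the helper names `parse_lscpu` and `topology_table` are hypothetical, the sample input is made up for demonstration, and the choice of `lscpu` as the data source is an assumption.

```python
# Illustrative sketch only; this is NOT how incbench is implemented.
# parse_lscpu and topology_table are hypothetical helper names.


def parse_lscpu(text):
    """Parse `lscpu`-style 'Key: value' lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            info[key.strip()] = value.strip()
    return info


def topology_table(info):
    """Format a few socket/NUMA fields as a plain-text table."""
    fields = ["Socket(s)", "Core(s) per socket", "NUMA node(s)"]
    rows = [(name, info.get(name, "n/a")) for name in fields]
    width = max(len(name) for name, _ in rows)
    lines = [f"{'Field'.ljust(width)} | Value", f"{'-' * width}-+------"]
    lines += [f"{name.ljust(width)} | {value}" for name, value in rows]
    return "\n".join(lines)


# Made-up sample input. On Linux, the text could instead come from
# subprocess.run(["lscpu"], capture_output=True, text=True).stdout
SAMPLE = """\
Socket(s):             2
Core(s) per socket:    28
NUMA node(s):          2
"""

info = parse_lscpu(SAMPLE)
print(topology_table(info))
```

Keeping parsing separate from formatting lets the table be tested on captured text without access to the target machine.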

## Type of Change
feature
## Description
Add new features for layer-wise quantization, including `get_weight`, `get_bias`, `update`, and save/load, to make it as easy to use as a normal model....

## Type of Change
bug fix
## Description
Fix a BF16 `symbolic_trace` bug:
1. it causes abnormal recursive calls
2. necessary attributes are missing

By moving BF16 fallback ahead of quantization and removing...

## Type of Change
feature or bug fix or documentation or validation or others
API changed or not
## Description
detail description
## Expected Behavior & Potential Risk
the expected...

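## Type of Change
feature
## Description
usage
```python
from neural_compressor.torch.algorithms.layer_wise import load_empty_model
from neural_compressor.torch.quantization import GPTQConfig, convert, prepare

# Load the model structure without materializing the full weights.
model = load_empty_model("hf-internal-testing/tiny-random-GPTJForCausalLM")
quant_config = GPTQConfig(
    use_layer_wise=True, model_path="hf-internal-testing/tiny-random-GPTJForCausalLM"
)
model = prepare(model, quant_config)
run_fn(model)  # user-supplied calibration function, not defined in this snippet
model = convert(model)
...
```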

INC3.X
PyTorch

updates:
- [github.com/psf/black.git: 24.3.0 → 24.4.2](https://github.com/psf/black.git/compare/24.3.0...24.4.2)
- [github.com/asottile/blacken-docs: 1.16.0 → 1.18.0](https://github.com/asottile/blacken-docs/compare/1.16.0...1.18.0)
- [github.com/codespell-project/codespell: v2.2.6 → v2.3.0](https://github.com/codespell-project/codespell/compare/v2.2.6...v2.3.0)
- [github.com/astral-sh/ruff-pre-commit: v0.3.5 → v0.5.0](https://github.com/astral-sh/ruff-pre-commit/compare/v0.3.5...v0.5.0)

## Type of Change
## Description
Port auto-detection of absorb layers for TEQ
```bash
pytest -sv test/3x/torch/algorithms/weight_only/test_teq_quantizer.py -k test_teq_detect_absorb_layers
```
## Expected Behavior & Potential Risk
PRE-CI
## Dependency Change?
None

INC3.X
PyTorch

## Type of Change
feature
API changed or not: no
## Description
Use a different `WeightOnlyLinear` module depending on the device.
- Abstract the `WeightOnlyLinear` class, with inherited classes `INCWeightOnlyLinear` and `HPUWeightOnlyLinear`
- Load...

WIP
PyTorch
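
The device-dependent `WeightOnlyLinear` selection described in the item above can be pictured with a minimal pure-Python sketch: an abstract base class, two device-specific subclasses, and a factory keyed by device name. Every name below (`WeightOnlyLinearSketch`, `CpuLinearSketch`, `HpuLinearSketch`, `make_weight_only_linear`) is hypothetical and chosen for illustration. The sketch does not reproduce the real `INCWeightOnlyLinear` or HPU classes, and it assumes a single per-tensor scale rather than any actual packing format.

```python
from abc import ABC, abstractmethod


class WeightOnlyLinearSketch(ABC):
    """Hypothetical abstract base: integer weights plus one per-tensor scale."""

    backend = "abstract"

    def __init__(self, qweight, scale):
        self.qweight = qweight  # list of rows, each a list of ints
        self.scale = scale      # single float scale (an assumption)

    @abstractmethod
    def unpack(self):
        """Return the dequantized weight as a list of rows of floats."""

    def forward(self, x):
        # y = W @ x using the dequantized weight; shared by all subclasses.
        weight = self.unpack()
        return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


class CpuLinearSketch(WeightOnlyLinearSketch):
    backend = "cpu"

    def unpack(self):
        return [[q * self.scale for q in row] for row in self.qweight]


class HpuLinearSketch(WeightOnlyLinearSketch):
    backend = "hpu"

    def unpack(self):
        # Same math as the CPU sketch; a real device class would use its
        # own storage layout and kernels here.
        return [[q * self.scale for q in row] for row in self.qweight]


_REGISTRY = {"cpu": CpuLinearSketch, "hpu": HpuLinearSketch}


def make_weight_only_linear(device, qweight, scale):
    """Pick the subclass registered for `device` and instantiate it."""
    try:
        cls = _REGISTRY[device]
    except KeyError:
        raise ValueError(f"unsupported device: {device!r}") from None
    return cls(qweight, scale)


layer = make_weight_only_linear("hpu", [[1, 2], [3, 4]], 0.5)
print(layer.backend, layer.forward([2.0, 2.0]))
```

Callers only touch the factory and the shared `forward`, so supporting another device means registering one more subclass.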

## Type of Change
Update example for PyTorch 3x mixed precision
## Description
- [x] add Torchvision resnet18 model as an example
- [x] update document
## How has this...

examples
INC3.X
PyTorch