neural-compressor
Support LayerWise for RTN/GPTQ
Type of Change
feature
Description
usage
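from neural_compressor.torch.algorithms.layer_wise import load_empty_model
from neural_compressor.torch.quantization import GPTQConfig, convert, prepare

# Build the model structure without loading its weights into memory.
model = load_empty_model("hf-internal-testing/tiny-random-GPTJForCausalLM")
quant_config = GPTQConfig(
    use_layer_wise=True,
    model_path="hf-internal-testing/tiny-random-GPTJForCausalLM",
)
model = prepare(model, quant_config)
run_fn(model)  # user-defined calibration function
model = convert(model)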
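For readers unfamiliar with the idea behind `use_layer_wise=True`, the following is a minimal conceptual sketch of layer-wise round-to-nearest (RTN) quantization in plain Python. It is not the neural_compressor implementation: `rtn_quantize`, `quantize_layer_wise`, and the `load_layer` callback are hypothetical names introduced only for illustration, and the sketch assumes symmetric per-layer quantization over flat lists of floats. The point it shows is that only one layer's weights need to be held in memory at a time.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def rtn_quantize(weights: List[float], bits: int = 4) -> Tuple[List[int], float]:
    """Symmetric round-to-nearest quantization of one layer's weights.

    Returns the integer codes and the scale needed to dequantize them.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest code and clamp to the signed range.
    codes = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def quantize_layer_wise(
    layer_names: Iterable[str],
    load_layer: Callable[[str], List[float]],  # hypothetical weight loader
    bits: int = 4,
) -> Dict[str, Tuple[List[int], float]]:
    """Quantize layers one at a time so only one layer is resident at once."""
    results: Dict[str, Tuple[List[int], float]] = {}
    for name in layer_names:
        weights = load_layer(name)  # load just this layer's weights
        results[name] = rtn_quantize(weights, bits)
        del weights  # release the full-precision copy before the next layer
    return results
```

In this sketch, `load_layer` stands in for reading a single layer from the checkpoint on disk, which is the role `model_path` would plausibly play in the usage above.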
Expected Behavior & Potential Risk
The expected behavior triggered by this PR
How has this PR been tested?
Pre-CI
Dependency Change?
any library dependency introduced or removed
⚡ Required checks status: All passing 🟢
Groups summary
🟢 Code Scan Tests workflow
| Check ID | Status | Error details |
|---|---|---|
| Code-Scan | success ✅ | |
| Code-Scan (Bandit Code Scan Bandit) | success ✅ | |
| Code-Scan (DocStyle Code Scan DocStyle) | success ✅ | |
| Code-Scan (Pylint Code Scan Pylint) | success ✅ | |
These checks are required after the changes to neural_compressor/torch/algorithms/layer_wise/load.py, neural_compressor/torch/algorithms/layer_wise/utils.py, neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/algorithms/weight_only/modules.py, neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/__init__.py, .azure-pipelines/scripts/codeScan/pylint/pylint.sh.
🟢 Model Tests 3x workflow
| Check ID | Status | Error details |
|---|---|---|
| Model-Test-3x | success ✅ | |
| Model-Test-3x (Generate Report GenerateReport) | success ✅ | |
| Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4) | success ✅ | |
| Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb) | success ✅ | |
| Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml) | success ✅ | |
These checks are required after the changes to neural_compressor/torch/algorithms/layer_wise/load.py, neural_compressor/torch/algorithms/layer_wise/utils.py, neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/algorithms/weight_only/modules.py, neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/__init__.py, requirements_pt.txt.
🟢 Unit Tests 3x-PyTorch workflow
| Check ID | Status | Error details |
|---|---|---|
| UT-3x-Torch | success ✅ | |
| UT-3x-Torch (Coverage Compare CollectDatafiles) | success ✅ | |
| UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch) | success ✅ | |
| UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline) | success ✅ | |
These checks are required after the changes to neural_compressor/torch/algorithms/layer_wise/load.py, neural_compressor/torch/algorithms/layer_wise/utils.py, neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/algorithms/weight_only/modules.py, neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/__init__.py, test/3x/torch/algorithms/weight_only/test_woq_module.py, test/3x/torch/quantization/weight_only/test_gptq.py, test/3x/torch/quantization/weight_only/test_rtn.py, requirements_pt.txt.
Thank you for your contribution! 💜
Note: This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.