neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
@airMeng Hi, could you provide more detailed information about the sparsity method on this [page](https://github.com/intel/neural-compressor/blob/master/docs/sparsity.md)? What does the sparsity pattern mean (e.g., 2x1, 16x4)?
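To make the question concrete: in most structured-sparsity schemes, an NxM pattern means pruning decisions are made per block of N rows by M columns of the weight matrix rather than per element, so a 2x1 pattern zeroes weights in aligned pairs. The sketch below is only a minimal NumPy illustration of that idea; the `block_sparsify` helper and its L1-norm scoring are assumptions for illustration, not Neural Compressor's actual pruning implementation.

```python
import numpy as np

def block_sparsify(weight, block=(2, 1), sparsity=0.5):
    """Zero out whole blocks of a 2D weight matrix (illustrative only)."""
    rows, cols = weight.shape
    br, bc = block
    assert rows % br == 0 and cols % bc == 0

    # Group the matrix into (rows/br) x (cols/bc) blocks of shape br x bc,
    # then score each block by its L1 norm (a simple, common criterion).
    blocks = weight.reshape(rows // br, br, cols // bc, bc)
    scores = np.abs(blocks).sum(axis=(1, 3))

    # Keep the highest-scoring blocks, zero the rest.
    n_prune = int(scores.size * sparsity)
    threshold = np.sort(scores, axis=None)[n_prune]
    mask = (scores >= threshold)[:, None, :, None]
    return (blocks * mask).reshape(rows, cols)

w = np.random.randn(8, 4)
# With a (2, 1) pattern, zeros always appear in vertically aligned pairs.
print(block_sparsify(w, block=(2, 1), sparsity=0.5))
```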
Type of Change: bug fix; API not changed. JIRA ticket: ILITV-2517. Expected Behavior & Potential Risk: the ITEX quantization can run the device user...
Signed-off-by: wenhuach21. Type of Change: not specified. Description: not specified. JIRA ticket: xxx. Expected Behavior...
Type of Change: fix conda install deps conflict. JIRA ticket: (blank). Expected Behavior & Potential Risk: conda install succeeds. How has this PR been...
Signed-off-by: Mengni Wang. Type of Change: bug fix. JIRA ticket: https://jira.devtools.intel.com/browse/ILITV-2529
Type of Change: example. Description: update the ONNXRT example for the new API. JIRA ticket: [ILITV-2468](https://jira.devtools.intel.com/browse/ILITV-2468). How has this PR been tested? Extension test on ONNX models. Dependency...
Type of Change: documentation; API changed or not: none. JIRA ticket: [ILITV-2572](https://jira.devtools.intel.com/browse/ILITV-2572). Expected Behavior & Potential Risk: none. How has this...
Hi. When we use INC to convert [HBONet](https://github.com/d-li14/HBONet.git), we get an exception. We hope you can help us fix it. Version info: neural-compressor 1.13.1, torch 1.12.1. We use INC like...
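Since the reporter's snippet is truncated, the sketch below only shows a typical neural-compressor 1.x post-training quantization flow on a PyTorch model; the stand-in model, random calibration data, and the `conf.yaml` path are assumptions for illustration, not the actual reproducer.

```python
import torch
from torch.utils.data import TensorDataset
from neural_compressor.experimental import Quantization, common

# Stand-in model and calibration data for illustration only; the reporter's
# actual HBONet instance and dataloader are not shown in the truncated issue.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
calib_dataset = TensorDataset(torch.randn(32, 3, 224, 224),
                              torch.zeros(32, dtype=torch.long))

quantizer = Quantization('conf.yaml')            # YAML: framework, tuning criteria, etc. (assumed path)
quantizer.model = common.Model(model)            # wrap the FP32 torch.nn.Module
quantizer.calib_dataloader = common.DataLoader(calib_dataset)
q_model = quantizer.fit()                        # calibrate, quantize, and tune
if q_model is not None:
    q_model.save('./quantized_model')
```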