
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Results: 155 neural-compressor issues, sorted by recently updated

Hi all, I have been trying to apply **post-training quantization** to a custom vision model (a pretrained VGG16) which I have already fine-tuned using "xpu" (Intel GPU Max Series). I have...

https://github.com/intel/neural-compressor/blob/4372a762585189accc65196e081a0a7a85f5af9e/neural_compressor/torch/algorithms/weight_only/utility.py#L69

```python
FP4_BNB = [-12.0, -8.0, -6.0, -4.0, -3.0, -2.0, -0.0625, 0, 0.0625, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0]
FP4_E2M1 = [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.0625, 0, 0.0625, 1.0, ...]
```
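These tables enumerate every value a 4-bit code can represent; weight-only quantization then maps each (scaled) weight to its nearest grid entry. A minimal sketch of that rounding step, using the FP4_BNB values quoted above (`round_to_grid` is a hypothetical helper for illustration, not part of neural-compressor's API):

```python
# FP4_BNB grid copied from the utility.py excerpt above.
FP4_BNB = [-12.0, -8.0, -6.0, -4.0, -3.0, -2.0, -0.0625, 0,
           0.0625, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0]

def round_to_grid(x, grid=FP4_BNB):
    """Map a (pre-scaled) weight to the closest representable FP4 value."""
    return min(grid, key=lambda g: abs(g - x))

weights = [0.03, -2.4, 5.1, 11.0]
print([round_to_grid(w) for w in weights])  # [0, -2.0, 6.0, 12.0]
```

In the real flow each weight group is first divided by a per-group scale so that it falls inside the grid's range; only the 4-bit index of the chosen entry is stored.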

## Type of Change
example
## Description
Add an SDXL model example to INC 3.x.
## Expected Behavior & Potential Risk
## How has this PR been tested?
Local test.
##...

When loading the quantized model (SmoothQuant) with

```python
from neural_compressor.utils.pytorch import load
qmodel = load(qmodel_path, model_fp)
```

I got `RecursiveScriptModule(original_name=QuantizationDispatchModule)`. I'd like to extract the quantized int8 weight matrices, together...
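For eager-mode quantized modules, the raw integer codes are reachable via `int_repr()` on the quantized tensor; the sketch below uses stock PyTorch dynamic quantization as a self-contained stand-in (the SmoothQuant path above instead yields a TorchScript module, whose tensors can be inspected through `qmodel.state_dict()`):

```python
import torch

# Demo model: a single Linear layer, dynamically quantized to int8.
fp_model = torch.nn.Sequential(torch.nn.Linear(4, 2))
qmodel = torch.ao.quantization.quantize_dynamic(
    fp_model, {torch.nn.Linear}, dtype=torch.qint8
)

w_q = qmodel[0].weight()         # quantized weight tensor
codes = w_q.int_repr()           # raw int8 codes
print(codes.dtype, codes.shape)  # torch.int8 torch.Size([2, 4])
print(w_q.dequantize())          # float reconstruction for comparison
```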


## Type of Change
feature; API not changed
## Description
Add support for the xpu device for 3.x IPEX static quantization.
## Expected Behavior & Potential Risk
## How has this PR...

## Type of Change
others
## Description
1. Remove deprecated modules
2. Bump version to v3.0
## Expected Behavior & Potential Risk
CI pass
## How has this PR been...

https://github.com/intel/neural-compressor/blob/master/docs/source/validated_model_list.md/#pytorch-models-with-torch-201cpu-in-woq-mode shows the accuracy of `int4` compared to `fp32`. Is there any data for 4-bit floating-point types (e.g. `nf4`, `fp4`, etc.) and their performance? Thanks!

## Type of Change
feature or bug fix or documentation or validation or others
API changed or not
## Description
Usage:
```bash
None  # default value, Autodetect (client is True)...
```

I am trying to run the int4 quantization examples from `examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/weight_only`, but a package is missing from requirements.txt: the `numba` package needs to be added. ```...

## Type of Change
Update WOQ int4 recipes
Bump INC version to 3.2
## Description
detail description
## Expected Behavior & Potential Risk
the expected behavior that triggered by this...