neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
## Type of Change
feature

## Description
- [x] Support converting an unquantized `linear` into `fp16`
- [ ] Extend the fp16 ops list to align with...
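The entry above describes converting unquantized `linear` modules to fp16. A minimal PyTorch sketch of such a pass might look like the following; the helper name is hypothetical and this is not the actual neural-compressor implementation, which operates on its own quantized-model representation:

```python
import torch
from torch import nn

def convert_unquantized_linear_to_fp16(model: nn.Module) -> nn.Module:
    """Cast every nn.Linear left unquantized to float16 (hypothetical sketch).

    Here any plain nn.Linear is treated as "unquantized"; the real pass
    would only touch layers the quantizer skipped.
    """
    for module in model.modules():
        if isinstance(module, nn.Linear):
            module.half()  # weight and bias become torch.float16 in place
    return model

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model = convert_unquantized_linear_to_fp16(model)
print(model[0].weight.dtype)  # torch.float16
```

Casting in place via `Module.half()` keeps the module graph intact, so only the parameter dtypes change.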
## Type of Change
feature

## Description
- [x] Update config params
- [x] Update `get_autoround_default_run_fn`
- [x] Update prepare/convert
- [x] Return packing model
- [x] Enhance UT
- ...
## Type of Change
bug fix; API changed: no

## Description
Update the lm-eval evaluation in the ONNX Runtime LLM example.

## How has this PR been tested?
Extension test

##...
## Type of Change
SmoothQuant supports `calib_func` for auto-tune, so no dataloader is needed.

## Description
- Enable layer-wise & block-wise
- Add UT to check auto-tune
- Check LLM examples

## Expected Behavior &...
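The entry above replaces the calibration dataloader with a user-supplied `calib_func`. A minimal sketch of that contract, using stand-in names rather than the actual neural-compressor API: the tuner is handed a callable that runs the model over representative inputs so activation statistics can be collected, and the callable's return value is ignored.

```python
def make_calib_func(samples):
    """Build a calibration callable from representative samples (hypothetical)."""
    def calib_func(model):
        for x in samples:
            model(x)  # forward pass only; outputs are discarded
    return calib_func

class StatCollectingModel:
    """Stand-in model that records the activations it sees (hypothetical)."""
    def __init__(self):
        self.seen = []

    def __call__(self, x):
        self.seen.append(x)
        return x * 2

model = StatCollectingModel()
calib = make_calib_func([1.0, 2.0, 3.0])
calib(model)  # the tuner would invoke this during auto-tune
print(model.seen)  # [1.0, 2.0, 3.0]
```

This is why no dataloader is required: the callable fully encapsulates how calibration data reaches the model.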
## Type of Change
3.x example bug fix

## Description

## Expected Behavior & Potential Risk
Pass the extension test.

## How has this PR been tested?

## Dependency Change?