neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
For llama, two patterns are not detected: `mlp.down_proj` -> `mlp.up_proj` and `.self_attn.o_proj` -> `module.self_attn.v_proj`; for opt: `self_attn.out_proj` -> `self_attn.v_proj`.
## Type of Change
feature; API changed: `get_woq_tuning_config`

## Description
Add torch WOQ tuning: https://github.com/intel/neural-compressor/blob/master/docs/source/quantization_weight_only.md#woq-algorithms-tuning

## How has this PR been tested?
Pre-CI

## Dependency Change?
None
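WOQ algorithm tuning, as described in the linked doc, amounts to trying candidate weight-only quantization configs and keeping the best one by an accuracy metric. A minimal conceptual sketch of that loop, assuming hypothetical names (`tune_woq`, `evaluate`, the config dicts) rather than the actual neural-compressor API:

```python
# Conceptual sketch of a WOQ tuning loop: try candidate configs in
# order, track the best score, and stop early once an accuracy goal
# is met. All names here are hypothetical illustrations, not the
# real neural-compressor 3.x API.

def tune_woq(model, candidate_configs, evaluate, accuracy_goal):
    """Return (best_config, best_score) over the candidate configs."""
    best_cfg, best_score = None, float("-inf")
    for cfg in candidate_configs:
        score = evaluate(model, cfg)  # e.g. accuracy of the quantized model
        if score > best_score:
            best_cfg, best_score = cfg, score
        if score >= accuracy_goal:    # early stop once the goal is met
            break
    return best_cfg, best_score

# Toy usage with a made-up accuracy table keyed by (algo, bits).
configs = [
    {"algo": "RTN", "bits": 4},
    {"algo": "GPTQ", "bits": 4},
    {"algo": "RTN", "bits": 8},
]
fake_accuracy = {("RTN", 4): 0.70, ("GPTQ", 4): 0.74, ("RTN", 8): 0.78}
best_cfg, best_acc = tune_woq(
    None,
    configs,
    lambda model, cfg: fake_accuracy[(cfg["algo"], cfg["bits"])],
    accuracy_goal=0.75,
)
```

The early-stop check mirrors the usual accuracy-driven tuning pattern: cheaper configs are tried first and tuning halts at the first config that satisfies the goal.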
## Type of Change
Modify `op_type` for `set_local` in the 3.x API in unit tests and examples

## Description
According to the changes in PR https://github.com/intel/neural-compressor/pull/1745

## Expected Behavior & Potential Risk
...
## Type of Change
API changed or not: None

## How has this PR been tested?
Pre-CI

## Dependency Change?
None
## Type of Change
feature

## Description
Support MX quant

## Expected Behavior & Potential Risk
the expected behavior that triggered by this PR

## How has this PR been...
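MX (microscaling) formats quantize small blocks of elements with one shared power-of-two scale per block. A minimal sketch of that idea, assuming a symmetric integer element type and a block size of 4; this illustrates the block-scaling concept only, not the exact OCP MX encoding or the implementation in this PR:

```python
# Block-wise quantization with a shared power-of-two scale per block,
# as in microscaling (MX) formats. elem_max=7 approximates a signed
# 4-bit integer element; block_size and elem_max are illustrative.
import math

def mx_quantize_dequantize(values, block_size=4, elem_max=7):
    out = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        amax = max(abs(v) for v in block) or 1.0
        # One shared scale per block, rounded up to a power of two.
        scale = 2.0 ** math.ceil(math.log2(amax / elem_max))
        q = [max(-elem_max, min(elem_max, round(v / scale))) for v in block]
        out.extend(x * scale for x in q)  # dequantized values
    return out

vals = [0.1, -0.2, 0.4, 3.0]
deq = mx_quantize_dequantize(vals)
```

Note how the single large element (3.0) dictates the block's scale, so the small elements in the same block lose precision; that trade-off is why MX blocks are kept small.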
I was wondering if there is a way to resume quantization from history.snapshot? I am using ONNX and onnxrt_cuda_ep. I can quantize the model, but before saving the model, ...
## Type of Change
example; API not changed

## Description
Added SDXL smooth quant example.

## Expected Behavior & Potential Risk
the expected behavior that triggered by this PR

...
Hi, I want to convert and quantize a PyTorch model to an ONNX model. I referred to this example: https://github.com/intel/neural-compressor/blob/master/examples/pytorch/image_recognition/torchvision_models/export/fx/main.py When calling the export function, there is an error: "'q_config' is needed when export...
Bumps [ejs](https://github.com/mde/ejs) from 3.1.9 to 3.1.10.

Release notes sourced from ejs's releases: v3.1.10.

Commits:
- d3f807d Version 3.1.10
- 9ee26dd Mocha TDD
- e469741 Basic pollution protection
- 715e950 Merge pull request...
Hello community, I've tried the smoothquant flow on an OPT-125m model with the default settings. Unsurprisingly, the activations are quantized per tensor and the weights per channel. According to the...
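The question above contrasts per-tensor activation quantization with per-channel weight quantization. A minimal sketch of the two scale computations for symmetric int8, assuming absmax calibration (the granularities described in the issue, not the exact smoothquant implementation):

```python
# Per-tensor vs per-channel scale computation for symmetric int8.
# Per-tensor: one scale for the whole tensor (typical for activations).
# Per-channel: one scale per output row (typical for weights).

def per_tensor_scale(x, qmax=127):
    """Single scale from the absolute max over the whole 2-D tensor."""
    return max(abs(v) for row in x for v in row) / qmax

def per_channel_scales(w, qmax=127):
    """One scale per row, from each row's absolute max."""
    return [max(abs(v) for v in row) / qmax for row in w]

acts = [[0.5, -2.0], [1.0, 0.25]]   # toy activation tensor
wts = [[0.1, -0.4], [2.0, 0.05]]    # toy weight matrix
s_act = per_tensor_scale(acts)      # one scalar for all activations
s_wt = per_channel_scales(wts)      # one scale per weight row
```

Per-channel scales let a row with small weights keep a fine-grained scale instead of inheriting the tensor-wide maximum, which is why weights are usually quantized per channel while activations, whose ranges vary per input, get a single calibrated per-tensor scale.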