neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
## Type of Change
feature or bug fix or documentation or validation or others
API changed or not
## Description
detail description
## Expected Behavior & Potential Risk
the expected...
Hi, I am trying to reproduce some of the examples, but it looks like they are outdated. I am not able to load `from neural_compressor.experimental import Quantization, common`, which appears...
## Type of Change
A feature: UniformQDQ
## Description
A feature of UniformQDQ - support CV/NLP model ops, including Conv, DepthwiseConv2D, MatMul, etc. Additional op support will be added upon request....
## Type of Change
# What does this PR do?
Support FP8 static quantization for optimum-habana DeepSeek V3/R1 models using Intel Neural Compressor (INC). This feature needs changes in: -...
Fixes sw-219134. Removes the legacy import in init.
Recover PatchedVLLMKVCache
As examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/mx_quant/Readme suggests, I run `python run_clm_no_trainer.py --model ./Qwen2-1.5B-Instruct --quantize --accuracy --tasks lambada_openai --w_dtype fp4 --woq`, but it returns an error: ``` 2025-02-12 13:21:16 [WARNING][auto_accelerator.py:418] Auto detect accelerator: CPU_Accelerator. 2025-02-12 13:21:16... ```
Hi, I wonder whether Neural Compressor supports vision-language models that accept both images and text as inputs?
Is it compatible with FlexAttention from PyTorch 2.6.0?
When quantizing with MX FP, the quantization scales of subnormal and normal values should be different. Why does L394 clip to min_exp? I understand that it should clip to 1. Looking...
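To make the question concrete, here is a minimal, hedged sketch of how an MX-style per-block shared exponent is typically derived and clamped. This is illustrative only, not the library's actual code at L394; the names `ELEM_EMAX` and `MIN_EXP` and their values are assumptions.

```python
import math

# Illustrative constants (assumptions, not values from neural-compressor):
ELEM_EMAX = 2    # largest exponent an FP4 (E2M1) element can represent
MIN_EXP = -127   # lower clamp applied to the shared block exponent

def shared_exponent(block):
    """Shared exponent for one MX block:
    floor(log2(max|x|)) - ELEM_EMAX, clamped below at MIN_EXP."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return MIN_EXP  # all-zero block: any scale works
    exp = math.floor(math.log2(amax)) - ELEM_EMAX
    # This clamp is the point the issue asks about: tiny (subnormal-range)
    # blocks get the same floor MIN_EXP instead of a separate scale rule,
    # which keeps 2**exp representable but treats subnormals uniformly.
    return max(exp, MIN_EXP)

block = [0.5, -1.5, 0.25, 3.0]
exp = shared_exponent(block)       # floor(log2(3.0)) - 2 = 1 - 2 = -1
scale = 2.0 ** exp                 # 0.5
scaled = [x / scale for x in block]  # values then rounded to FP4 elements
```

Under this sketch, whether the clamp should be `MIN_EXP` or a value near 1 changes only how blocks whose maximum magnitude falls in the subnormal range are scaled; normal-range blocks are unaffected.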