neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Results: 155 neural-compressor issues, sorted by recently updated

I'm curious whether you will support Arc; Neural Compressor would particularly benefit those platforms. Thanks!

https://github.com/intel-innersource/frameworks.ai.pytorch.ipex-cpu/issues/2404

If a quantized model contains fake quantization nodes, how can such a model be parallelized, and how can its accuracy be validated on a dataset?
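
Not an authoritative answer, but a fake-quantized model is still an ordinary PyTorch module, so it can typically be wrapped for data parallelism and evaluated exactly like the float model. A minimal sketch, where `model` and `eval_loader` are hypothetical placeholders for the user's own objects:

```python
import torch

# `model` (the fake-quantized module) and `eval_loader` are placeholders.
model = torch.nn.DataParallel(model)  # replicate across all visible GPUs
model.eval()

correct, total = 0, 0
with torch.no_grad():
    for inputs, labels in eval_loader:
        preds = model(inputs).argmax(dim=-1)
        correct += (preds == labels).sum().item()
        total += labels.numel()

print(f"top-1 accuracy: {correct / total:.4f}")
```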

I think there is a bug in [ORTSmoothQuant._adjust_weights()](https://github.com/intel/neural-compressor/blob/de385a432acff1bc0384086c8c35b3442b860fc8/neural_compressor/adaptor/ox_utils/smooth_quant.py#L669). Part of this method is presented below:

```python
def _adjust_weights(self, scales):
    """Adjust the weights with scale.

    Args:
        scales (dict): The input scales
    """
    ...
```
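
For context, a SmoothQuant-style weight adjustment multiplies each input channel of a weight by its smoothing scale so that the matching division applied to the activations cancels out. Below is a minimal NumPy sketch of that invariant; the `weight` and `scales` values are made up for illustration and are not taken from the method above.

```python
import numpy as np

# Hypothetical values, for illustration only.
rng = np.random.default_rng(0)
weight = rng.standard_normal((4, 8)).astype(np.float32)    # (in_channels, out_channels)
scales = np.array([0.5, 1.0, 2.0, 4.0], dtype=np.float32)  # one scale per input channel

# Scale each input-channel row of the weight.
adjusted = weight * scales[:, None]

# The output is unchanged when activations are divided by the same scales:
x = rng.standard_normal((2, 4)).astype(np.float32)
assert np.allclose(x @ weight, (x / scales) @ adjusted, atol=1e-5)
```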

This RFC proposes a Hugging Face-compatible yet flexible Weight-Only Quantization (WOQ) format in INC, so that a model quantized by INC can be loaded by IPEX for...
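
For reference, weight-only quantization can be driven through the INC 2.x post-training API roughly as sketched below; the exact field values are assumptions, and the on-disk layout of the saved model is precisely what this RFC is about.

```python
from neural_compressor import PostTrainingQuantConfig
from neural_compressor.quantization import fit

# `float_model` is a placeholder for your PyTorch model.
conf = PostTrainingQuantConfig(
    approach="weight_only",
    op_type_dict={
        ".*": {  # apply to all matching ops
            "weight": {
                "bits": 4,         # INT4 weights
                "group_size": 32,  # per-group scales
                "scheme": "sym",
                "algorithm": "RTN",
            },
        },
    },
)
q_model = fit(model=float_model, conf=conf)
q_model.save("./woq_output")  # this saved format is what the RFC would standardize
```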

Hello, I'm attempting to train a model for a microcontroller that only supports 8-bit precision or lower. This works perfectly when training with your `QuantizationAwareTrainingConfig`. In addition to this we...
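
For anyone landing here, the INC 2.x QAT flow looks roughly like the sketch below; `model` and `train_one_epoch` are hypothetical placeholders, and details may differ by version.

```python
from neural_compressor import QuantizationAwareTrainingConfig
from neural_compressor.training import prepare_compression

# `model` and `train_one_epoch` are placeholders for the user's own code.
conf = QuantizationAwareTrainingConfig()
compression_manager = prepare_compression(model, conf)

compression_manager.callbacks.on_train_begin()
model = compression_manager.model
train_one_epoch(model)  # the usual training loop, now with fake-quant inserted
compression_manager.callbacks.on_train_end()

compression_manager.save("./qat_output")
```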

I want to use the sparsity feature of neural-compressor to prune model weights at block-wise granularity. Unlike traditional pruning approaches that zero out pruned weights, I...
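
For reference, the block-wise granularity itself is configured through the `pattern` field of `WeightPruningConfig` in INC 2.x. The sketch below shows the standard flow, which does zero out pruned blocks, so it covers only the granularity part of this request; `model`, `optimizer`, and `train_loader` are placeholders.

```python
from neural_compressor import WeightPruningConfig
from neural_compressor.training import prepare_compression

# `model`, `optimizer`, and `train_loader` are placeholders.
conf = WeightPruningConfig(
    pruning_type="snip_momentum",
    pattern="4x1",          # block-wise granularity: prune in 4x1 blocks
    target_sparsity=0.8,
    start_step=0,
    end_step=1000,
)
compression_manager = prepare_compression(model, conf)
compression_manager.callbacks.on_train_begin()

for step, batch in enumerate(train_loader):
    compression_manager.callbacks.on_step_begin(step)
    loss = model(**batch).loss
    loss.backward()
    compression_manager.callbacks.on_before_optimizer_step()
    optimizer.step()
    optimizer.zero_grad()
    compression_manager.callbacks.on_after_optimizer_step()
    compression_manager.callbacks.on_step_end()

compression_manager.callbacks.on_train_end()
```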

Several models, such as LaMini-GPT, use this layer, but unfortunately most of our algorithms do not currently support it. W8A8 (SQ) and weight-only (RTN, TEQ) should better support `transformers.conv1d` and `torch.conv1d`...
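
A common interim workaround (an assumption here, not an official INC path) is to swap Hugging Face's `Conv1D` for an equivalent `nn.Linear`, which the existing algorithms already handle; `Conv1D` stores its weight as `(in_features, out_features)`, the transpose of `nn.Linear`.

```python
import torch
from transformers.pytorch_utils import Conv1D  # transformers.modeling_utils in older releases

def conv1d_to_linear(conv: Conv1D) -> torch.nn.Linear:
    """Build an nn.Linear that computes the same function as HF's Conv1D."""
    in_features, out_features = conv.weight.shape         # Conv1D weight is (in, out)
    linear = torch.nn.Linear(in_features, out_features)
    linear.weight.data = conv.weight.data.T.contiguous()  # Linear weight is (out, in)
    linear.bias.data = conv.bias.data.clone()
    return linear
```

Applied recursively over a model's modules before quantization, this lets the W8A8 and weight-only algorithms treat such layers as ordinary linears.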

Hello, I have been attempting to quantize the t5-small model using the t5-small topology, despite making changes to hyperparameters such as `tune = True` and `save_strategy="epoch"`. I have already created...

Hi, the quantisation function `neural_compressor.quantization.fit` returns a `PyTorchFXModel` object, which contains two members, `fp32_model` and `model`. Could you please let me know the correct way of evaluating...
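
Not an official answer, but the class layout suggests that `fp32_model` retains the original float module while `model` holds the quantized FX graph module, so the latter is the one to benchmark. A minimal sketch, with `float_model`, `calib_loader`, and `eval_loader` as hypothetical placeholders:

```python
import torch
from neural_compressor import PostTrainingQuantConfig
from neural_compressor.quantization import fit

# `float_model`, `calib_loader`, and `eval_loader` are placeholders.
q_model = fit(model=float_model,
              conf=PostTrainingQuantConfig(),
              calib_dataloader=calib_loader)

quantized = q_model.model       # quantized torch.fx module: evaluate this one
reference = q_model.fp32_model  # original float model, kept for comparison

quantized.eval()
correct, total = 0, 0
with torch.no_grad():
    for inputs, labels in eval_loader:
        preds = quantized(inputs).argmax(dim=-1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
print(f"INT8 top-1 accuracy: {correct / total:.4f}")
```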