neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
The input model format is ONNX, but the onnxruntime framework cannot detect the model format.
### Hi Team, thanks for the wonderful tool. I am trying to quantize a SavedModel with multiple inputs but am facing the issue below. Code: ``` import tensorflow as tf from neural_compressor.experimental import...
I'm following the TensorFlow [BERT MRPC example](https://github.com/intel/neural-compressor/blob/master/examples/tensorflow/nlp/bert_base_mrpc/run_classifier.py) to run the neural compressor with a saved model that I exported after [fine tuning BERT from the Intel Model Zoo](https://github.com/IntelAI/models/tree/master/benchmarks/language_modeling/tensorflow/bert_large/training/fp32) using the...
Recently, I quantized a pre-trained ResNet50 model from fp32 to int8, and I noticed that the performance isn't what I expected. The performance is only about 2x compared to the...
When I use tf2.5 with NIC, I want to use tensorboard, and my conf is ``` version: 1.0 model: # mandatory. used to specify model specific information. name: origin_model framework:...
I am trying to run the DeepLab segmentation model in the directory “neural-compressor/examples/tensorflow/semantic_image_segmentation/deeplab/quantization/ptq/main.py”. The data is prepared correctly, in the form of .tfrecords. But after the preparation steps, when I am trying...
Bumps [joblib](https://github.com/joblib/joblib) from 1.1.0 to 1.2.0. Changelog Sourced from joblib's changelog. Release 1.2.0 Fix a security issue where eval(pre_dispatch) could potentially run arbitrary code. Now only basic numerics are supported....
Bumps [node-fetch](https://github.com/node-fetch/node-fetch) to 2.6.7 and updates ancestor dependency [react](https://github.com/facebook/react/tree/HEAD/packages/react). These dependencies need to be updated together. Updates `node-fetch` from 1.7.3 to 2.6.7 Release notes Sourced from node-fetch's releases. v2.6.7 Security...
When I run the demo case in the `README`, I hit the following issues: ### UNIMPLEMENTED: DNN library is not found: ### Please install Intel® Optimizations for TensorFlow or MKL enabled TensorFlow...
One of the example files of the Neural Compressor project uses the following approach to fetch all the available backbones of TorchVision: https://github.com/intel/neural-compressor/blob/3482e789e1d26967c448ca53a6bba8714f75c8f2/examples/pytorch/image_recognition/torchvision_models/distillation/eager/main.py#L11-L13 The above approach will return all backbones...