neural-compressor
New API ONNXRT example update
Type of Change
example
Description
Update the ONNXRT examples for the new API.
JIRA ticket: ILITV-2468
How has this PR been tested?
extension test on onnx models
Dependency Change?
no
hi @chensuyue, PR is ready for extension test
@chensuyue extension test:
https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3784/artifact/report.html

The performance regression is caused by switching the performance benchmark dataset from a dummy dataset to the real dataset.
Extension tests for the other examples:
https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3851/
https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3877/ Note: object detection models need new quantization recipe support from the Strategy team and may not pass the extension test for now.
NLP models failed because of some typos and code changes that did not work. Retest: https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3883/
Retest: https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3890/ yolov3, yolov4 and tiny_yolov3 will not be enabled in this version because 'onnxrt.graph_optimization.level' is not supported yet.
- ssd-12, ssd-12_qdq, faster_rcnn, faster_rcnn_qdq, mask_rcnn and mask_rcnn_qdq will be re-enabled in 2.1, once 'onnxrt.graph_optimization.level' and the quantization recipe are supported. Please ignore them in the extension test.
- The hf models failed with the error 'setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.', which is caused by a numpy version update (see the linked issue).
Update:
- remove the ssd, faster_rcnn and mask_rcnn models
- update the model config json
- add numpy==1.23.5 to requirements.txt for the huggingface models
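For context on the numpy pin: NumPy 1.24 turned the creation of ragged (variable-length) arrays from a deprecation warning into a `ValueError`, which matches the 'inhomogeneous shape' message quoted in this thread. The snippet below is a minimal sketch of that behaviour and of one version-independent workaround. It is not the code used in the examples of this PR; the helper name `as_object_array` and the sample data are hypothetical.

```python
import numpy as np


def as_object_array(items):
    """Pack variable-length sequences into a 1-D object array.

    Hypothetical helper for illustration only: it avoids np.array(ragged),
    so it behaves the same before and after NumPy 1.24.
    """
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = np.asarray(item)
    return out


# Four sequences of different lengths, similar in shape to the failing case.
ragged = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]

# On NumPy >= 1.24 this raises ValueError; on older releases it only warns.
try:
    np.array(ragged)
    ragged_error = None
except ValueError as exc:
    ragged_error = str(exc)

batch = as_object_array(ragged)
print(np.__version__, ragged_error)
print(batch.shape, batch.dtype)
```

Pinning numpy==1.23.5 keeps the old warning-only behaviour without touching the example code; building the object array explicitly, as sketched here, would be an alternative if the pin needs to be lifted later.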
Retest: https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3909/
passed: bert_squad_model_zoo_dynamic, mobilebert_squad_mlperf_dynamic, mobilebert_squad_mlperf_qdq, duc, BiDAF_dynamic and huggingface question answering models
failed: gpt2_lm_head_wikitext_model_zoo_dynamic and huggingface text classification models. Retest: https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3913/
passed: bert_squad_model_zoo_dynamic, mobilebert_squad_mlperf_dynamic, mobilebert_squad_mlperf_qdq, duc, BiDAF_dynamic and huggingface question answering models: https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3908/artifact/report.html
gpt2_lm_head_wikitext_model_zoo_dynamic and huggingface text classification models: https://inteltf-jenk.sh.intel.com/job/intel-lpot-validation-top-mr-extension/3919/artifact/report.html