onnxruntime
[Quantization] Fix static quantize runner usage.
Description
- Input pb files were read in incorrect order.
- Cause: Python's sorted() was used to order the input files. It compares file names lexicographically, so "10" sorts before "2" and the files were read in the wrong order.
- Fix: Enumerate indices to read the input files in numeric order.
- CumSum's output wasn't quantized.
- Cause: CumSum wasn't registered into QDQ registry.
- Fix: Register CumSum with QDQDirect8bitOp.
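The input-ordering issue can be reproduced with a minimal sketch. The `input_<index>.pb` naming and the helper names `lexicographic_order` and `index_order` are assumptions for illustration only, not the runner's actual code:

```python
def lexicographic_order(file_names):
    """Order names the way plain sorted() does: character by character."""
    return sorted(file_names)


def index_order(file_names, prefix="input_", suffix=".pb"):
    """Order names by enumerating indices 0, 1, 2, ... until one is missing.

    The prefix/suffix naming scheme is a hypothetical stand-in.
    """
    present = set(file_names)
    ordered = []
    index = 0
    while f"{prefix}{index}{suffix}" in present:
        ordered.append(f"{prefix}{index}{suffix}")
        index += 1
    return ordered


# Twelve hypothetical input files: input_0.pb ... input_11.pb
names = [f"input_{i}.pb" for i in range(12)]

# Plain sorted() places input_10.pb and input_11.pb before input_2.pb.
print(lexicographic_order(names)[:4])

# Enumerating indices keeps input_2.pb right after input_1.pb.
print(index_order(names)[:4])
```

With ten or fewer inputs both orderings agree, which may be why the problem only shows up for models with more inputs.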
Motivation and Context
Fix two issues in static_quantize_runner usage.