
[API] Support QUANTIZE and DEQUANTIZE operations

Open · lisa0314 opened this issue 5 years ago · 2 comments

To run an int8 model on DNNL, we need to add two new operations: QUANTIZE and DEQUANTIZE.
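For context, here is a minimal sketch of the standard linear quantization semantics these ops would cover; the scale/zeroPoint parameter names are illustrative, not a fixed API from this repo or DNNL.

```js
// QUANTIZE sketch: f32 -> int8, q = clamp(round(x / scale) + zeroPoint, -128, 127)
function quantize(input, scale, zeroPoint) {
  const out = new Int8Array(input.length);
  for (let i = 0; i < input.length; ++i) {
    const q = Math.round(input[i] / scale) + zeroPoint;
    out[i] = Math.max(-128, Math.min(127, q)); // clamp to the signed int8 range
  }
  return out;
}

// DEQUANTIZE sketch: int8 -> f32, x = (q - zeroPoint) * scale
function dequantize(input, scale, zeroPoint) {
  const out = new Float32Array(input.length);
  for (let i = 0; i < input.length; ++i) {
    out[i] = (input[i] - zeroPoint) * scale;
  }
  return out;
}
```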

lisa0314 · Apr 15 '20 11:04

What's the usage of these two operators in an int8 model? Could they be supported by pre-processing and post-processing JS code?

huningxin · Apr 27 '20 06:04

Each op has a y_scale, and the first layer of the int8 model has an x_scale that is used to quantize the model's input, since the model's input type is f32. I suppose this could be handled by pre-processing in JS code; in that case the input type for our model would be TENSOR_QUANT8_ASYMM_SIGNED instead of TENSOR_FLOAT32.

I suppose the quantization of the input could be implemented either in JS or in DNNL. If we decide to handle input quantization in JS code, we could close this and the other related issues. @huningxin would you like to share your opinion?
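A rough sketch of what the JS pre/post-processing path under discussion could look like, reusing the quantize/dequantize helpers from the earlier sketch. The execution calls, xScale/yScale values, and OUTPUT_SIZE are placeholders loosely modeled on an NNAPI-style execution interface, not the actual webml-polyfill API.

```js
// Assumed helpers from the earlier sketch: quantize(), dequantize().
async function runInt8Model(execution, floatInput,
                            xScale, xZeroPoint, yScale, yZeroPoint) {
  // Pre-process: quantize the f32 input so the model's input tensor can be
  // TENSOR_QUANT8_ASYMM_SIGNED instead of TENSOR_FLOAT32.
  const quantInput = quantize(floatInput, xScale, xZeroPoint);
  execution.setInput(0, quantInput);

  const OUTPUT_SIZE = 1000; // model-specific placeholder
  const quantOutput = new Int8Array(OUTPUT_SIZE);
  execution.setOutput(0, quantOutput);

  await execution.startCompute();

  // Post-process: dequantize the int8 output back to f32 for the caller.
  return dequantize(quantOutput, yScale, yZeroPoint);
}
```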

lisa0314 · Apr 27 '20 06:04