Support `float16` input for `tf.quantization.fake_quant_*` ops
System information
- TensorFlow version (you are using): 2.6, nightly
- Are you willing to contribute it (Yes/No): Yes
Describe the feature and the current behavior/state.
Currently, the tf.quantization.fake_quant_* ops do not support float16 input. For example:
```python
import tensorflow as tf

x = tf.random.uniform((32, 3, 3, 32), dtype=tf.float16)
tf.quantization.fake_quant_with_min_max_vars(x, -6.0, 6.0)
```
This fails because there is no kernel implementation for float16 input on either CPU or GPU.
Check out this notebook for a full reproduction.
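In case the notebook isn't handy, a self-contained version of the comparison is sketched below (the broad `except` clause is an assumption, since the exact exception type depends on how the op is built):
```python
import tensorflow as tf

# float32 input has a registered kernel and works as expected.
x32 = tf.random.uniform((32, 3, 3, 32), dtype=tf.float32)
tf.quantization.fake_quant_with_min_max_vars(x32, -6.0, 6.0)

# float16 input is rejected; the exact exception type depends on
# eager vs. graph construction, so we catch broadly here.
x16 = tf.random.uniform((32, 3, 3, 32), dtype=tf.float16)
try:
    tf.quantization.fake_quant_with_min_max_vars(x16, -6.0, 6.0)
except (tf.errors.OpError, TypeError, ValueError) as e:
    print(f"float16 rejected with {type(e).__name__}")
```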
Will this change the current api? How?
This won't change the API.
Who will benefit with this feature?
The lack of float16 support in tf.quantization.fake_quant_* ops prevents people doing quantisation-aware training from using Keras mixed precision training, so many performance optimisations are out of reach for anyone who needs quantisation-aware training. In our specific case, training with tf.quantization.fake_quant_* ops in the graph is twice as slow as training without them, because activations have to be cast back to float32 around these ops.
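For illustration, the casting workaround referred to above looks roughly like this minimal sketch (the shapes and min/max values are placeholders, not taken from our actual model):
```python
import tensorflow as tf

# Workaround today: cast float16 activations up to float32, fake-quantise,
# then cast the result back to float16 for the rest of the mixed precision
# graph. Native float16 support would remove both casts.
x = tf.random.uniform((32, 3, 3, 32), dtype=tf.float16)
x_fq = tf.quantization.fake_quant_with_min_max_vars(
    tf.cast(x, tf.float32), min=-6.0, max=6.0)
x_fq = tf.cast(x_fq, tf.float16)
```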
Any Other info.
I think it should be fairly straightforward to add float16 support to the fake quant functor, as it is a native Eigen implementation.
@abattery Thanks for taking a look! I am not sure why this issue was transferred to TF-MOT, though, since it is not directly related to TF-MOT. The issue lies not in the Python code of this repo, but in the missing fp16 support in TensorFlow's core fake quantisation ops (in fact, I am personally not even using TF-MOT for the quantisation-aware training I was referring to above).
Hi @Xhark, do you think this should be handled by the MOT team or the TF core team?