
Strange behavior when quantizing a model

Open IdrissARM opened this issue 1 year ago • 2 comments

Hi all,

I was trying to quantize my model but something strange popped up.

I am using TensorFlow v2.14 and tfmot v0.7.5

I have a subclassed tf.keras.Model. It contains some custom layers as well as standard layers such as concatenate, activation, etc.

I only want specific layers to be quantized. For instance, I do not want the concatenate layer (and 2 other layers) to be quantized. So I did not annotate them, which means they are not instances of QuantizeAnnotate.

But strangely, I see that concatenate is an instance of QuantizeWrapperV2, even though I can also see in the clone function that it is not an instance of QuantizeAnnotate.

So I do not understand why here: https://github.com/tensorflow/model-optimization/blob/e38d886935c9e2004f72522bf11573d43f46b383/tensorflow_model_optimization/python/core/quantization/keras/quantize.py#L384 we add to requires_output_quantize a layer that is NOT an instance of QuantizeAnnotate, rather than checking isinstance as the name suggests. My un-annotated layers should be returned from here: https://github.com/tensorflow/model-optimization/blob/e38d886935c9e2004f72522bf11573d43f46b383/tensorflow_model_optimization/python/core/quantization/keras/quantize.py#L377 but while debugging I see that concatenate and the other layers I did not annotate were added to requires_output_quantize. I believe this is wrong.
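To make my reading concrete, here is a toy pure-Python paraphrase of how I understand the loop that fills requires_output_quantize. The build_requires_output_quantize function, the is_annotated and inputs_of helpers, and the toy graph are all hypothetical stand-ins I made up for this sketch; this reflects my reading of quantize.py, not the actual implementation:

```python
def build_requires_output_quantize(layer_names, is_annotated, inputs_of):
    """Toy paraphrase of my reading of the loop around quantize.py#L377-L384.

    A layer that is NOT annotated still lands in the set whenever its
    output feeds an annotated layer -- i.e. the set collects names of
    layers that are not instances of QuantizeAnnotate.
    """
    requires_output_quantize = set()
    for name in layer_names:
        if not is_annotated(name):
            continue
        # Look at which layers feed this annotated layer.
        for producer in inputs_of(name):
            if not is_annotated(producer):
                requires_output_quantize.add(producer)
    return requires_output_quantize

# Toy graph: conv -> concatenate -> dense, with only conv and dense annotated.
annotated = {'conv', 'dense'}
graph = {'conv': [], 'concatenate': ['conv'], 'dense': ['concatenate']}
names = build_requires_output_quantize(
    ['conv', 'concatenate', 'dense'],
    is_annotated=lambda n: n in annotated,
    inputs_of=lambda n: graph[n])
# 'concatenate' ends up in the set even though it was never annotated.
```

If this paraphrase is right, the set's contents are exactly the opposite of what its name led me to expect.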

Now, if we go to this - https://github.com/tensorflow/model-optimization/blob/e38d886935c9e2004f72522bf11573d43f46b383/tensorflow_model_optimization/python/core/quantization/keras/quantize.py#L418C28-L418C52 we can see that if the layer is neither in requires_output_quantize nor in layer_quantize_map, we simply return the layer unchanged. But from what I am seeing, and from what this: https://github.com/tensorflow/model-optimization/blob/e38d886935c9e2004f72522bf11573d43f46b383/tensorflow_model_optimization/python/core/quantization/keras/quantize.py#L384 suggests, requires_output_quantize holds layers that should NOT be quantized.

So, according to my analysis, I would expect `layer.name not in requires_output_quantize` to be removed from this if statement: https://github.com/tensorflow/model-optimization/blob/e38d886935c9e2004f72522bf11573d43f46b383/tensorflow_model_optimization/python/core/quantization/keras/quantize.py#L417

In conclusion, I suspect requires_output_quantize is buggy, unless there is an explanation for this behavior that I am missing.

I would really appreciate it if someone could take the time to explain this. Maybe I am wrong.

I look forward to your feedback.

Thanks, Idriss

IdrissARM avatar Jan 23 '24 19:01 IdrissARM

@Xhark can you take a look at this?

abattery avatar Jan 26 '24 15:01 abattery

Any updates @Xhark, @abattery?

IdrissARM avatar Feb 07 '24 11:02 IdrissARM