
[GraphOptz] Sink Quantize below Reshape

Open mciprian13 opened this issue 5 years ago • 4 comments


Sink Quantize below Reshape to add the opportunity to merge with a Dequantize node and improve precision.
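The proposed rewrite turns Quantize -> Reshape into Reshape -> Quantize. Since per-tensor affine quantization is elementwise, it commutes with a reshape, so the sunk form produces identical integer values. A minimal numpy sketch (the `quantize` helper and the scale/offset values are illustrative assumptions, not Glow code):

```python
import numpy as np

# Illustrative per-tensor affine int8 quantization (not Glow's implementation).
def quantize(x, scale, offset):
    return np.clip(np.round(x / scale) + offset, -128, 127).astype(np.int8)

x = np.random.default_rng(0).standard_normal((2, 3, 4)).astype(np.float32)
scale, offset = 0.05, 3  # assumed quantization parameters

# Before the opt: Quantize, then Reshape. After: Reshape, then Quantize (sunk).
before = quantize(x, scale, offset).reshape(6, 4)
after = quantize(x.reshape(6, 4), scale, offset)

# Quantization is elementwise, so the two orders agree element for element.
assert np.array_equal(before, after)
```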

mciprian13 avatar Jun 23 '20 13:06 mciprian13

When do you encounter this particular case? I think this transformation is dangerous since it can move the Reshape between an input placeholder and the Quantize. Then your fully quantized model becomes partially quantized (the Reshape is now float), and if your backend expects a fully quantized model, it will fail compilation. In general, I think we should not sink anything below Quantize/Dequantize. When there is a Quantize/Dequantize node somewhere, there is usually a good reason for it.

tlepley-cadence avatar Jun 25 '20 16:06 tlepley-cadence

I think this transformation is dangerous since it can move the Reshape between an input placeholder and the Quantize. Then your fully quantized model becomes partially quantized (the Reshape is now float), and if your backend expects a fully quantized model, it will fail compilation.

A reshape is just a different view of memory/is generally a no-op for backends, so this doesn't really seem problematic to me -- what's your specific issue with this?
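To illustrate the "different view of memory" point by analogy (this is numpy behavior, not Glow internals), a reshape of a contiguous tensor allocates nothing and copies nothing:

```python
import numpy as np

a = np.arange(6, dtype=np.int8)
b = a.reshape(2, 3)  # no data is copied; b is a view onto a's buffer

b[0, 0] = 99         # mutating the view...
assert a[0] == 99    # ...is visible through the original array
assert b.base is a   # b shares a's memory
```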

In general, the Quantize being there implies that its input Node is already float, and the presence of the Dequantize implies there are already float Nodes following it. You're suggesting that if the input is a Placeholder then no non-storage Nodes can be float -- is that the property you're trying to maintain?

The main issue with this sort of opt IMO is that eliminating a Quantize-Dequantize pair changes the numerics of the model. Though this is often fine, since it moves from integer to floating point, which is generally more accurate, and of course it improves performance by allowing us to skip a quantize+dequantize.
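To make the numerics point concrete, here is a hedged numpy sketch (the `quantize`/`dequantize` helpers and the scale value are illustrative assumptions, not Glow code) showing that a Quantize+Dequantize round trip perturbs each value by up to scale/2, so eliminating the pair does change results:

```python
import numpy as np

# Illustrative affine int8 quantize/dequantize pair (not Glow's implementation).
def quantize(x, scale, offset):
    return np.clip(np.round(x / scale) + offset, -128, 127).astype(np.int8)

def dequantize(q, scale, offset):
    return (q.astype(np.float32) - offset) * scale

x = np.array([0.123, -0.456, 0.789], dtype=np.float32)
scale, offset = 0.05, 0  # assumed quantization parameters

roundtrip = dequantize(quantize(x, scale, offset), scale, offset)

# Skipping the pair keeps x exact; the round trip introduces rounding error
# bounded by scale/2 per element (absent clipping).
assert not np.allclose(roundtrip, x, atol=1e-6)
assert np.max(np.abs(roundtrip - x)) <= scale / 2 + 1e-6
```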

jfix71 avatar Jun 25 '20 18:06 jfix71

It's a no-op node, but our backend expects it to be quantized (because we requested the IR to be quantized). For a fully quantized IR, the actual model inputs are the Quantize nodes (the input image contains integer data, not float data), and if the Reshape is moved to float behind the Quantize boundary, there will be an inconsistency between the shape of the model input and the actual shape of the input data.

In general, I think that un-quantizing nodes in generic optimizations should be avoided, because if we have requested to quantize them, there is usually a good reason for it.

I'm curious about the use case discussed: a Reshape between a Quantize and a Dequantize. In which situations does this happen? Is it an actual use case?

tlepley-cadence avatar Jun 26 '20 15:06 tlepley-cadence

For a fully quantized IR, the actual model inputs are the Quantize nodes (the input image contains integer data, not float data), and if the Reshape is moved to float behind the Quantize boundary, there will be an inconsistency between the shape of the model input and the actual shape of the input data.

I'm a little unclear on your use case here -- are you suggesting that your backend expects to see a float PH -> QuantizeNode, and that you then provide int data even though the input PH was float? Shouldn't you be fusing the Quantize into the PH? Then we wouldn't need to worry about this optimization, as long as it runs after the float PH -> QuantizeNode to int8 PH fusion.

jfix71 avatar Sep 24 '20 18:09 jfix71