[Streamline] Remove operations when scaling factors cancel each other out
When operations that use quantized tensors (floating-point numbers with a scaling factor) are exported, each operation is split into three parts. First, a division node divides the input by the input scale factor, yielding a tensor of integers. Next, the operation itself is implemented as an integer operation. Finally, before the data is passed back to the flow, a multiplication node multiplies the result by the output scale factor. For streamlining, this means that additional floating-point operations are added to the graph. However, if the input scale factor of one node matches the output scale factor of the previous node, the multiplication and the following division cancel each other out. FINN needs a streamlining transformation that identifies this pattern and removes both nodes from the graph.
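The pattern described above can be illustrated with a minimal sketch in plain Python. It does not use the FINN or QONNX APIs: the `Node` class, the `remove_cancelling_scales` function, and all tensor names are hypothetical, and the sketch assumes that every node has a single output and that the scale factor is a scalar constant passed as the second input of `Mul` and `Div`.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    op: str        # operator name, e.g. "Mul", "Div", or an integer op
    inputs: list   # names of the input tensors
    output: str    # name of the single output tensor


def remove_cancelling_scales(nodes, consts, rel_tol=1e-9):
    """Remove Mul(x, s) -> Div(., s) pairs whose scale constants match.

    `consts` maps constant tensor names to scalar values. Nodes that
    consumed the Div output are rewired to the Mul's data input, so the
    remaining Node objects are modified in place. Graph outputs are not
    tracked in this sketch.
    """
    nodes = list(nodes)
    changed = True
    while changed:
        changed = False
        for mul in nodes:
            if mul.op != "Mul" or mul.inputs[1] not in consts:
                continue
            # The Mul result must feed exactly one node, and that node
            # must be a Div that uses it as the dividend.
            consumers = [n for n in nodes if mul.output in n.inputs]
            if len(consumers) != 1:
                continue
            div = consumers[0]
            if div.op != "Div" or div.inputs[0] != mul.output:
                continue
            if div.inputs[1] not in consts:
                continue
            # Only remove the pair when both scale factors agree.
            if not math.isclose(consts[mul.inputs[1]],
                                consts[div.inputs[1]], rel_tol=rel_tol):
                continue
            # Bypass the pair: consumers of the Div output now read the
            # integer tensor that was fed into the Mul.
            for n in nodes:
                n.inputs = [mul.inputs[0] if name == div.output else name
                            for name in n.inputs]
            nodes = [n for n in nodes if n is not mul and n is not div]
            changed = True
            break  # the node list changed, so restart the scan
    return nodes


# Two quantized ops in sequence; the output scale of the first (s1)
# equals the input scale of the second (s2).
consts = {"s0": 0.5, "s1": 0.25, "s2": 0.25, "s3": 0.125}
graph = [
    Node("Div", ["x", "s0"], "x_int"),
    Node("IntOpA", ["x_int"], "a_int"),
    Node("Mul", ["a_int", "s1"], "a"),
    Node("Div", ["a", "s2"], "b_in"),
    Node("IntOpB", ["b_in"], "b_int"),
    Node("Mul", ["b_int", "s3"], "y"),
]
streamlined = remove_cancelling_scales(graph, consts)
```

In this example the inner `Mul`/`Div` pair is removed and `IntOpB` reads `a_int` directly, while the outer `Div` and `Mul` stay in place because they have no matching partner. A real transformation would additionally have to handle non-scalar scale tensors, a `Mul` output with several consumers, and a `Div` output that is also a graph output.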