openvino [Snippets] Added support of INT8 models

Open a-sidorova opened this issue 2 years ago • 1 comments

Details:

Added FakeQuantize (FQ) support to Snippets by decomposing FQ into simpler operations
Updated snippets_mark_skipped: fixed ZeroPoints handling and Reduce cases
Fixed ConvertSaturation and ConvertTruncation: added BWDCMP_RTTI_DEFINITION to avoid conflicts in MatcherPass
Removed the fake BroadcastMove from Snippets for cases where the last dimensions are equal
IMPORTANT: This PR is blocked by PR#12363. Load/Store emitters are very slow in ScalarTile cases.
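
For readers unfamiliar with FQ decomposition, the idea can be sketched in plain Python on a single scalar. This is only an illustration, not the code from this PR: the function names are hypothetical, the reference formula follows the FakeQuantize operation specification, and the chain of clamp, scale, shift, round, scale, shift is an assumed form of the decomposition.

```python
def fake_quantize_reference(x, in_low, in_high, out_low, out_high, levels):
    """Scalar FakeQuantize as a single fused operation (hypothetical helper)."""
    if x <= min(in_low, in_high):
        return out_low
    if x > max(in_low, in_high):
        return out_high
    q = round((x - in_low) / (in_high - in_low) * (levels - 1))
    return q / (levels - 1) * (out_high - out_low) + out_low


def fake_quantize_decomposed(x, in_low, in_high, out_low, out_high, levels):
    """Same result built only from simple elementwise steps (assumed decomposition)."""
    clamped = min(max(x, in_low), in_high)           # Maximum + Minimum
    in_scale = (levels - 1) / (in_high - in_low)     # constant, foldable ahead of time
    in_shift = -in_low * in_scale                    # constant, foldable ahead of time
    q = round(clamped * in_scale + in_shift)         # Multiply + Add + Round (half to even)
    out_scale = (out_high - out_low) / (levels - 1)  # constant, foldable ahead of time
    return q * out_scale + out_low                   # Multiply + Add


if __name__ == "__main__":
    params = dict(in_low=-1.0, in_high=1.0, out_low=-1.0, out_high=1.0, levels=256)
    for value in (-2.0, -0.5, 0.3, 2.0):
        print(value, fake_quantize_decomposed(value, **params))
```

Each step of the decomposed form is an ordinary elementwise operation, which is the kind of operation Snippets can already fuse.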

Tickets:

77307

Aug 02 '22 08:08 a-sidorova

@dmitry-gorokhov @IvanNovoselov I have addressed your comments. Please take another look.

I'd also like to note that I have rewritten the logic for Convert insertion and for resetting TypeRelaxed nodes. We still always insert ConvertSaturation before results (as before), and there is now one common pass, AlignDataType, that inserts ConvertSaturation on the inputs of any node where needed. This is done to cover other ops that may execute in non-FP32 precision in the future (for example, movement ops or MatMul).
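
As background on why the choice of Convert variant matters, here is a minimal sketch of the difference between a saturating and a truncating integer conversion to int8. It is not code from this PR: the helper names are hypothetical, and it assumes the usual meaning of the two terms (clamp to the target range versus keep only the low 8 bits).

```python
INT8_MIN, INT8_MAX = -128, 127


def convert_saturation_i8(value: int) -> int:
    """Clamp an integer into the int8 range (saturating conversion)."""
    return max(INT8_MIN, min(INT8_MAX, value))


def convert_truncation_i8(value: int) -> int:
    """Keep only the low 8 bits and reinterpret them as a signed int8."""
    low_byte = value & 0xFF
    return low_byte - 256 if low_byte >= 128 else low_byte


if __name__ == "__main__":
    for value in (100, 300, -200):
        print(value, convert_saturation_i8(value), convert_truncation_i8(value))
```

In-range values are unchanged by both conversions. Out-of-range values stay at the nearest representable bound with saturation, while truncation wraps them around.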

Sep 21 '22 07:09 a-sidorova
