
[Snippets] Added support of INT8 models

Open • a-sidorova opened this issue 2 years ago • 1 comment

Details:

  • Added FakeQuantize (FQ) support to Snippets, implemented as FQ decomposition
  • Updated snippets_mark_skipped: fixed ZeroPoints handling and Reduce cases
  • Fixed ConvertSaturation and ConvertTruncation: added BWDCMP_RTTI_DEFINITION to avoid conflicts in MatcherPass
  • Removed the fake BroadcastMove from Snippets for cases where the last dimensions are equal
  • IMPORTANT: This PR is blocked by PR#12363. Load/Store emitters are very slow in ScalarTile cases
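
For readers unfamiliar with the term, the idea behind FQ decomposition can be sketched on a single scalar. This is a minimal illustration, not the pass from this PR: the function names are hypothetical, it assumes `in_lo < in_hi`, and it uses Python's built-in `round` (round half to even) as the rounding mode. The reference function follows the FakeQuantize definition; the decomposed one expresses the same computation as a chain of simple elementwise steps (clamp, scale, round, rescale, shift).

```python
def fake_quantize_reference(x, in_lo, in_hi, out_lo, out_hi, levels):
    """FakeQuantize on one scalar, written as a single piecewise formula."""
    if x <= min(in_lo, in_hi):
        return out_lo
    if x > max(in_lo, in_hi):
        return out_hi
    q = round((x - in_lo) / (in_hi - in_lo) * (levels - 1))
    return q / (levels - 1) * (out_hi - out_lo) + out_lo


def fake_quantize_decomposed(x, in_lo, in_hi, out_lo, out_hi, levels):
    """The same computation as a chain of simple elementwise operations.

    Assumption for this sketch: in_lo < in_hi.
    """
    clamped = min(max(x, in_lo), in_hi)            # Maximum + Minimum (clamp)
    in_scale = (levels - 1) / (in_hi - in_lo)      # constant, can be precomputed
    out_scale = (out_hi - out_lo) / (levels - 1)   # constant, can be precomputed
    q = round((clamped - in_lo) * in_scale)        # Subtract + Multiply + Round
    return q * out_scale + out_lo                  # Multiply + Add


if __name__ == "__main__":
    params = dict(in_lo=-1.0, in_hi=1.0, out_lo=-1.0, out_hi=1.0, levels=256)
    for x in (-2.0, -0.3, 0.1, 0.77, 2.0):
        print(x, fake_quantize_reference(x, **params),
              fake_quantize_decomposed(x, **params))
```

Once FQ is written this way, each step is an ordinary elementwise op that can be fused with its neighbours.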

Tickets:

  • 77307

a-sidorova, Aug 02 '22 08:08

@dmitry-gorokhov @IvanNovoselov I have addressed your comments. Please take another look.

I'd also like to note that I have rewritten the logic for inserting Convert ops and resetting TypeRelaxed nodes. We still always insert ConvertSaturation before results (as before), and there is now one common pass, AlignDataType, that inserts ConvertSaturation on a node's inputs where needed. This is meant to cover other ops that may execute in non-FP32 precision in the future (for example, movement ops or MatMul).
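
To illustrate why the choice of Convert op matters on those inputs and results, here is a rough scalar sketch of the difference between a saturating and a truncating integer conversion to INT8. The function names are hypothetical and this is not the emitter code; it only assumes that saturation clamps to the target range while truncation keeps the low 8 bits.

```python
INT8_MIN, INT8_MAX = -128, 127


def convert_saturation_i8(value: int) -> int:
    """Clamp an integer to the INT8 range (saturating conversion)."""
    return max(INT8_MIN, min(INT8_MAX, value))


def convert_truncation_i8(value: int) -> int:
    """Keep only the low 8 bits and reinterpret them as signed INT8."""
    low = value & 0xFF
    return low - 256 if low >= 128 else low


if __name__ == "__main__":
    for v in (100, 300, -200):
        print(v, convert_saturation_i8(v), convert_truncation_i8(v))
    # 100 fits in INT8, so both conversions agree.
    # 300 saturates to 127 but truncates to 44.
    # -200 saturates to -128 but truncates to 56.
```

With truncation, an out-of-range value can wrap around and even change sign, which is why saturation is the safer default for quantized results.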

a-sidorova, Sep 21 '22 07:09