openvino
[Snippets] Added support of INT8 models
Details:
- Added FQ support to Snippets via FQ decomposition
- Updated snippets_mark_skipped: fixed ZeroPoints and Reduce cases
- Fixed ConvertSaturation and ConvertTruncation: added BWDCMP_RTTI_DEFINITION to avoid conflicts in MatcherPass
- Removed fake BroadcastMove from Snippets for cases when the last dimensions are equal
- IMPORTANT: This PR is blocked by PR#12363. Load/Store emitters are very slow in ScalarTile cases
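For readers unfamiliar with FQ decomposition, here is a minimal scalar sketch of the arithmetic a FakeQuantize op reduces to (clamp, scale, round, rescale). It is not the code from this PR: `fake_quantize_ref` and its parameter names are hypothetical, and it assumes `in_low < in_high`, `levels > 1`, and the default round-to-nearest-even mode.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical reference helper, not part of the OpenVINO API.
// Decomposes FakeQuantize into elementwise steps: clamp -> scale -> round -> rescale.
float fake_quantize_ref(float x, float in_low, float in_high,
                        float out_low, float out_high, int levels) {
    // Assumption of this sketch: a non-degenerate input range and at least two levels.
    assert(in_high > in_low && levels > 1);

    // 1. Clamp the input to [in_low, in_high].
    const float clamped = std::min(std::max(x, in_low), in_high);

    // 2. Map the input range onto [0, levels - 1] and round to the nearest level
    //    (std::nearbyint follows the current rounding mode).
    const float scale = (levels - 1) / (in_high - in_low);
    const float q = std::nearbyint((clamped - in_low) * scale);

    // 3. Map the quantized level onto the output range.
    return q / (levels - 1) * (out_high - out_low) + out_low;
}
```

Each step is an ordinary elementwise operation, which is what makes a decomposed FQ a candidate for the same kind of fusion as other elementwise ops.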
Tickets:
- 77307
@dmitry-gorokhov @IvanNovoselov I have applied your comments. Please take another look.
I'd also like to note that I have rewritten the logic for Convert insertion and for resetting TypeRelaxed nodes. We still always insert ConvertSaturation before results (as before), and there is now one common pass, AlignDataType, that inserts ConvertSaturation on the inputs of any node that needs it. This is meant to cover other ops that may execute in non-FP32 precision in the future (for example, movement ops or MatMul).
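To illustrate why the choice between a saturating and a truncating convert matters before results, here is a minimal sketch of the general difference for an int32 to u8 conversion. The function names are hypothetical and this is not the emitter code from the PR, only the usual meaning of the two terms.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helpers for illustration only, not part of the OpenVINO API.

// Truncating convert: keeps the low 8 bits, so out-of-range values wrap modulo 256.
std::uint8_t convert_truncation_u8(std::int32_t v) {
    return static_cast<std::uint8_t>(v);
}

// Saturating convert: clamps to the target range [0, 255] before narrowing.
std::uint8_t convert_saturation_u8(std::int32_t v) {
    const std::int32_t clamped = std::min<std::int32_t>(std::max<std::int32_t>(v, 0), 255);
    return static_cast<std::uint8_t>(clamped);
}
```

For example, an input of 300 becomes 44 with truncation and 255 with saturation, and -1 becomes 255 with truncation and 0 with saturation.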