openvino
[CPU][FP16]Fix GatherCompressed with FP16 scale
Details:
- This PR https://github.com/openvinotoolkit/openvino/pull/23384 triggers the existing bugs in GatherCompressed and FullyConnected
- After converting all possible constants to FP16, GatherCompressed needs to output FP16 when scale is changed to FP16
- After converting all possible constants to FP16, FullyConnected needs to explicitly refuse FP16 bias since oneDNN doesn't support FP16 bias.
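The two rules above can be illustrated with a minimal sketch. This is not the plugin's code: `Precision`, `gather_compressed_output_precision`, and `fully_connected_accepts_bias` are hypothetical names used only for illustration, and the FP32 fallback for a non-FP16 scale is an assumption, since the description only states the FP16 case.

```python
from enum import Enum
from typing import Optional


class Precision(Enum):
    # Hypothetical stand-in for the plugin's element types.
    FP32 = "f32"
    FP16 = "f16"


def gather_compressed_output_precision(scale: Precision) -> Precision:
    """Pick the output precision from the scale precision.

    An FP16 scale yields an FP16 output, as the description requires.
    Falling back to FP32 otherwise is an assumption of this sketch.
    """
    return Precision.FP16 if scale is Precision.FP16 else Precision.FP32


def fully_connected_accepts_bias(bias: Optional[Precision]) -> bool:
    """Return False for an FP16 bias; no bias or an FP32 bias is accepted."""
    return bias is not Precision.FP16
```

In this sketch, a pattern matcher would skip the FullyConnected fusion whenever `fully_connected_accepts_bias` returns `False`, rather than hand an FP16 bias to oneDNN.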
Tickets:
- CVS-140154
@yuxu42 @xipingyan @tiger100256-hu This PR should fix the GatherCompressed problem exposed by https://github.com/openvinotoolkit/openvino/pull/23384
@v-Golubev please also take a look
@v-Golubev Do you have any further comments?
I suppose we need to validate PR changes. @zhangYiIntel can you please run internal performance validation on LLM models and provide results in the ticket? Thanks in advance
@v-Golubev Thanks for your advice, let me trigger a test. As I understand it, though, this change only fixes a pattern-matching bug, so it has no side effect on current master, since the FP16 migration PR is not merged yet.
@zhangYiIntel Please rebase your changes once https://github.com/openvinotoolkit/openvino/pull/24117 is merged.