openvino
[CPU][FP16]Fix GatherCompressed with FP16 scale
Details:
- PR https://github.com/openvinotoolkit/openvino/pull/23384 triggers existing bugs in `GatherCompressed` and `FullyConnected`.
- After converting all possible constants to FP16, `GatherCompressed` needs to output `FP16` when the scale is changed to FP16.
- After converting all possible constants to FP16, `FullyConnected` needs to explicitly refuse `FP16` bias, since `oneDNN` doesn't support `FP16` bias.
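
The two rules above can be sketched as a pair of precision checks. This is only an illustration, not the code changed in this PR: the `Precision` enum and both function names are hypothetical stand-ins, not OpenVINO or oneDNN types, and the FP32 fallback for a non-FP16 scale is an assumption.

```cpp
// Hypothetical stand-in for illustration; not an OpenVINO or oneDNN type.
enum class Precision { FP32, FP16, BF16 };

// GatherCompressed: once the scale constant has been converted to FP16,
// the node output follows it. Falling back to FP32 for any other scale
// precision is an assumption made for this sketch.
constexpr Precision gather_compressed_output_precision(Precision scale) {
    return scale == Precision::FP16 ? Precision::FP16 : Precision::FP32;
}

// FullyConnected: explicitly refuse an FP16 bias, because the oneDNN
// primitive behind the node does not support it. Other precisions are
// accepted here only to keep the sketch small.
constexpr bool fully_connected_accepts_bias(Precision bias) {
    return bias != Precision::FP16;
}

// Compile-time checks of the two rules.
static_assert(gather_compressed_output_precision(Precision::FP16) == Precision::FP16,
              "FP16 scale implies FP16 output");
static_assert(!fully_connected_accepts_bias(Precision::FP16),
              "FP16 bias must be refused");
```

In a real pattern matcher, a refused bias would mean the bias constant is not fused into `FullyConnected` at that precision; how the plugin handles that case is not shown here.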
Tickets:
- CVS-140154
@yuxu42 @xipingyan @tiger100256-hu This PR could fix the problem introduced by https://github.com/openvinotoolkit/openvino/pull/23384 and GatherCompressed
@v-Golubev please also take a look
@v-Golubev Do you have further comments?
I suppose we need to validate PR changes. @zhangYiIntel can you please run internal performance validation on LLM models and provide results in the ticket? Thanks in advance
@v-Golubev Thanks for your advice, let me trigger a test. But in my mind, this change only fixes a pattern-matching bug, which has no side effects on current master since the FP16 migration PR is not merged yet.
@zhangYiIntel Please rebase your changes once https://github.com/openvinotoolkit/openvino/pull/24117 is merged.