openvino icon indicating copy to clipboard operation
openvino copied to clipboard

[CPU][FP16]Fix GatherCompressed with FP16 scale

Open zhangYiIntel opened this issue 9 months ago • 3 comments

Details:

  • This PR https://github.com/openvinotoolkit/openvino/pull/23384 triggers the existing bugs in GatherCompressed and Fullyconnected
  • After converting all possible constants to FP16, GatherCompresssed needs to output FP16 when scale is changed to FP16
  • After converting all possible constants to FP16, FullyConnected needs to explicitly refuse FP16 bias since oneDNN doesn't support FP16 bias.

Tickets:

  • CVS-140154

zhangYiIntel avatar Apr 30 '24 07:04 zhangYiIntel

@yuxu42 @xipingyan @tiger100256-hu This PR could fix the problem introduced by https://github.com/openvinotoolkit/openvino/pull/23384 and GatherCompressed

zhangYiIntel avatar May 06 '24 00:05 zhangYiIntel

@v-Golubev please also take a look

dmitry-gorokhov avatar May 09 '24 09:05 dmitry-gorokhov

@v-Golubev @v-Golubev Do you have further comments ?

zhangYiIntel avatar May 13 '24 08:05 zhangYiIntel

I suppose we need to validate PR changes. @zhangYiIntel can you please run internal performance validation on LLM models and provide results in the ticket? Thanks in advance

v-Golubev avatar May 16 '24 09:05 v-Golubev

I suppose we need to validate PR changes. @zhangYiIntel can you please run internal performance validation on LLM models and provide results in the ticket? Thanks in advance

@v-Golubev Thanks for your advice, let me trigger a test. But in my mind, this change only fix a pattern matching bug, which has no side effect on current master since FP16 migration PR is not merged yet.

zhangYiIntel avatar May 17 '24 01:05 zhangYiIntel

@zhangYiIntel Please rebase your changes once https://github.com/openvinotoolkit/openvino/pull/24117 is merged.

dmitry-gorokhov avatar May 20 '24 08:05 dmitry-gorokhov