openvino icon indicating copy to clipboard operation
openvino copied to clipboard

[CPU] Align cpu execution order before/after ResolveComplexInplaceConflicts()

Open meiyang-intel opened this issue 1 year ago • 3 comments
trafficstars

Details:

  • Align cpu execution order before/after ResolveComplexInplaceConflicts()
  • Keep order information of Results and Parameters when dump CPU graph to ov::Model

Tickets:

  • CVS-134638

Description:

  • CPU execution order of some nodes may changes after https://github.com/openvinotoolkit/openvino/blob/2024.2.0.dev20240513/src/plugins/intel_cpu/src/graph.cpp#L285. Sometimes that may give ResolveComplexInplaceConflicts() incorrect execution order information. That may lead to ResolveComplexInplaceConflicts() get the wrong conclusion which edge memory should be shared. So this PR add SortTopologically() right before ResolveComplexInplaceConflicts() to let execution order not change much before/after ResolveComplexInplaceConflicts()*
  • The node order of CPU graph topology is not stable. For example in below graph image If Parameter0 is before than Parameter1 in graphNodes, in original SortTopologically(), it will first recurse node down from Parameter0. So in final sorted graphNodes, Parameter0 will be sorted after Parameter1. Then in second round of SortTopologically(), it will first recurse from Parameter1 and in final sorted graphNodes, Parameter0 will be sorted before Parameter0 again. This will make sometimes ReduceProd is executed before ScatterNDUpdate while sometimes ReduceProd is after ScatterNDUpdate. It will mislead ResolveComplexInplaceConflicts()

meiyang-intel avatar Jun 11 '24 02:06 meiyang-intel

Is it possible to cover the change with an SL/Unit test, e.g. use the pattern from the issue ticket?

maxnick avatar Jun 13 '24 08:06 maxnick

Is it possible to cover the change with an SL/Unit test, e.g. use the pattern from the issue ticket?

@maxnick , Unit test is added. I also find another issue in CPU SortTopologically(), and it is fixed in this PR. Please help review.

meiyang-intel avatar Jul 08 '24 09:07 meiyang-intel

@maxnick , all CI tests have passed.

meiyang-intel avatar Jul 11 '24 03:07 meiyang-intel

About the failed CI issue. Find that it is caused by NodeMemory. The detail is in https://jira.devtools.intel.com/browse/CVS-148497

meiyang-intel avatar Aug 01 '24 00:08 meiyang-intel