
Modify Parallelization Strategy to Make it More General

Open zhenglongjiepheonix opened this issue 6 months ago • 1 comments

As per the title, this PR tries a more general approach rather than relying purely on human heuristics. It uses the following steps to search for a possible parallelization strategy for a transformer model:

  • Use dynamo for graph tracing so that we get the graph to operate on
  • Decompose and functionalize the traced graph so that we get a smaller op set to work with
  • Apply parallel axis analysis and do a constrained backtracking search on the whole graph to get a possible solution (not necessarily optimal)
  • Replace ops in the original traced graph with their parallelized versions (Linear -> ColumnLinear/RowLinear)
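
To make the constrained backtracking step concrete, here is a minimal toy sketch in plain Python. It is not the PR's implementation: the two layouts ("R" for replicated, "S" for sharded), the `STRATEGIES` table, and the `search` function are all hypothetical names chosen for illustration, and the "graph" is simplified to a linear chain of ops. The only constraint modeled is that the layout produced by one op must match the layout required by the next.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical strategy table: op kind -> strategy name -> (input layout, output layout).
# "R" = replicated on every device, "S" = sharded along the hidden axis.
STRATEGIES: Dict[str, Dict[str, Tuple[str, str]]] = {
    "linear": {
        "column": ("R", "S"),     # weight split by output features
        "row": ("S", "R"),        # weight split by input features, followed by all-reduce
        "replicate": ("R", "R"),  # no parallelization
    },
    "elementwise": {
        "shard": ("S", "S"),
        "replicate": ("R", "R"),
    },
}


def search(ops: List[str], input_layout: str = "R",
           output_layout: str = "R") -> Optional[List[str]]:
    """Return the first consistent strategy assignment for a chain of ops, or None."""

    def backtrack(i: int, layout: str, chosen: List[str]) -> Optional[List[str]]:
        if i == len(ops):
            # A full assignment is valid only if the final layout matches.
            return list(chosen) if layout == output_layout else None
        for name, (need, produce) in STRATEGIES[ops[i]].items():
            if need != layout:
                continue  # constraint violated: producer and consumer layouts differ
            chosen.append(name)
            result = backtrack(i + 1, produce, chosen)
            if result is not None:
                return result
            chosen.pop()  # undo the choice and try the next strategy
        return None

    return backtrack(0, input_layout, [])


# An MLP-like chain: Linear -> activation -> Linear.
plan = search(["linear", "elementwise", "linear"])
print(plan)
```

For the MLP-like chain, the search settles on a column-parallel linear, a sharded activation, and a row-parallel linear, which mirrors the Linear -> ColumnLinear/RowLinear replacement in the last step. Like the search described above, it returns the first feasible assignment rather than an optimal one.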

As for the API design, we drop support for passing custom modules and focus only on models in transformers, because supporting custom models is not a priority for now.

zhenglongjiepheonix, Aug 14 '24 01:08