
Implement reductions in parallel loops (easy starter)

Open aartbik opened this issue 4 years ago • 5 comments

Bugzilla Link 52312
Version unspecified
OS Linux
CC @joker-eph

Extended Description

Support parallel reductions. Currently only vector loops support reductions, but forall loops (i.e., scf::ParallelOp) can handle reductions too.

Relevant entry point: isParallelFor() method in https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp

aartbik avatar Oct 26 '21 00:10 aartbik

assigned to @aartbik

aartbik avatar Oct 26 '21 00:10 aartbik

I am interested in working on this issue. After looking at the isParallelFor() method, I think we can enable parallel reductions by simply removing !isReduction from the return statements inside each switch case. We should also modify the sparseOut condition to permit parallelization of sparse output with reductions. Everything else will probably be handled by the existing infrastructure.

To summarise, I am proposing the following isParallelFor() method:

```cpp
static bool isParallelFor(CodeGen &codegen, bool isOuter, bool isReduction,
                          bool isSparse, bool isVector) {
  if (codegen.sparseOut && !isReduction)
    return false;
  switch (codegen.options.parallelizationStrategy) {
  case SparseParallelizationStrategy::kNone:
    return false;
  case SparseParallelizationStrategy::kDenseOuterLoop:
    return isOuter && !isSparse && !isVector;
  case SparseParallelizationStrategy::kAnyStorageOuterLoop:
    return isOuter && !isVector;
  case SparseParallelizationStrategy::kDenseAnyLoop:
    return !isSparse && !isVector;
  case SparseParallelizationStrategy::kAnyStorageAnyLoop:
    return !isVector;
  }
  llvm_unreachable("unexpected parallelization strategy");
}
```

Am I on the right track? @aartbik

On a separate note, what is the preferred way of checking the parallel execution strategy of the generated loop (so that I can verify that my solution works)? I am hesitant about inspecting the generated IR manually since I am not very familiar with the OpenMP dialect.
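One way this is typically verified in-tree is a lit/FileCheck regression test that runs the sparsification pass with a parallelization strategy enabled and then checks that the generated loops are scf.parallel rather than scf.for; no OpenMP dialect knowledge is needed at that level. A rough sketch (the exact pass-option spelling is an assumption here and has changed across LLVM versions, so check the existing tests under mlir/test/Dialect/SparseTensor for the current form):

```mlir
// Hypothetical lit test fragment; flag syntax may differ by LLVM version.
// RUN: mlir-opt %s --sparsification="parallelization-strategy=any-storage-any-loop" | FileCheck %s

// With parallelization enabled, the sparse kernel should lower to a
// parallel loop with an explicit reduction, not a sequential scf.for:
// CHECK: scf.parallel
// CHECK: scf.reduce
```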

meshtag avatar Mar 10 '22 21:03 meshtag

Removing these flags is of course the first step, but the existing infrastructure will not simply take care of the rest: you would end up with a parallel loop construct that has a loop-carried dependence. Please have a look at ParallelOp in the SCF dialect, in particular the scf.reduce/scf.reduce.return constructs, to see what else is required.
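For context, a minimal sketch of what the SCF dialect expects (the buffer name and types are illustrative, and the textual syntax shown is the one used around this issue's timeframe; newer MLIR releases moved scf.reduce to the loop terminator position): the reduction value is carried through an init operand and combined in an scf.reduce region, rather than through a loop-carried scalar.

```mlir
// Sum-reduction over a 1-D memref expressed as a parallel loop.
// %lb, %ub, %step, %zero, and %A are assumed to be defined elsewhere.
%sum = scf.parallel (%i) = (%lb) to (%ub) step (%step) init (%zero) -> f32 {
  %elem = memref.load %A[%i] : memref<?xf32>
  // The region tells the runtime how to combine two partial results,
  // so iterations can execute and reduce in any order.
  scf.reduce(%elem) : f32 {
  ^bb0(%lhs: f32, %rhs: f32):
    %r = arith.addf %lhs, %rhs : f32
    scf.reduce.return %r : f32
  }
}
```

This is why just dropping the !isReduction checks is insufficient: the sparsifier must also emit the init operand and the combiner region instead of a plain accumulator update.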

aartbik avatar Mar 21 '22 18:03 aartbik

I can try to take a look at it.

PeimingLiu avatar Jun 07 '22 15:06 PeimingLiu

Check out https://reviews.llvm.org/D135927

PeimingLiu avatar Oct 14 '22 00:10 PeimingLiu

Completed by Peiming.

aartbik avatar Nov 10 '22 05:11 aartbik