llvm-project icon indicating copy to clipboard operation
llvm-project copied to clipboard

[MLIR][OpenMP] Support lowering of host_eval to LLVM IR

Open skatrak opened this issue 1 year ago • 4 comments

This patch updates the MLIR to LLVM IR lowering of omp.target to support passing num_teams, num_threads, thread_limit and SPMD loop bounds through the host_eval argument of omp.target.

This replaces the previous implementation where this information was directly attached to the omp.target operation rather than captured to be used by the corresponding nested operation.

The implementation of TargetOp::getInnermostCapturedOmpOp is also improved to address bugs in the detection of target SPMD in MLIR, uncovered by changes to the translation to LLVM IR.

skatrak avatar Oct 10 '24 15:10 skatrak

PR stack:

  • #178
  • #179
  • #180

skatrak avatar Oct 10 '24 15:10 skatrak

OpenMPIRBuilder changes LGTM

No test added?

Thank you @Meinersbur for taking a look. I added a test that is basically copied from an existing one, since that one used to check all corner cases related to the trip count calculation. I don't know if it makes sense to keep both as they are, but let me know what you think.

skatrak avatar Oct 29 '24 15:10 skatrak

What I had in mind was a test where getHostEvalVars() is non-empty and thus the code-generation changes, or is first possible without crashing.

calculateCanonicalLoopTripCount was already tested indirectly through createCanonicalLoop. These are unit-tests, and since calculateCanonicalLoopTripCount is not available as its own "unit", I think you can remove the equivalent ones for createCanonicalLoop.

Meinersbur avatar Oct 29 '24 16:10 Meinersbur

What I had in mind was a test where getHostEvalVars() is non-empty and thus the code-generation changes, or is first possible without crashing.

Ah, that makes sense. Sorry for the misunderstanding! I just added an MLIR to LLVM IR translation test for these host evaluated clauses. I thought you were referring to the OMPIRBuilder refactoring because that's what you said you had reviewed 😅.

calculateCanonicalLoopTripCount was already tested indirectly through createCanonicalLoop. These are unit-tests, and since calculateCanonicalLoopTripCount is not available as its own "unit", I think you can remove the equivalent ones for createCanonicalLoop.

What I just did was to remove the CanonicalLoopBounds test and replace it with the CanonicalLoopTripCount test that does the same checks of the trip count, but without creating and looking at the canonical loop structure (which is already done by CanonicalLoopSimple above). Let me know if that's not what you meant, so that I can update it again.

skatrak avatar Oct 30 '24 12:10 skatrak