PSyclone

Transform intrinsic reductions with implicit ranges to explicit loops (for Nemo OpenMP offloading)

Open sergisiso opened this issue 2 years ago • 0 comments

One part of NEMO that is currently not executed on the GPU with the OpenMP transformation is reduction intrinsics applied to implicit array ranges, for example:

    zmax(2) = MAXVAL(ABS(un(:, :, :)))
    zmax(3) = MAXVAL(- tsn(:, :, :, jp_sal), mask = tmask(:, :, :) == 1._wp)

In OpenACC the kernels directive understands the intrinsic. In OpenMP the "workshare" directive should handle it as well, but it is not supported by most OpenMP implementations.

The alternative is to convert the expression to its explicit loop reduction form. The current range2loop transformation only looks for statements with ranges on the rhs, so this will probably need to be a different transformation. We may also need to identify the reduction loops and add the appropriate OpenMP clause.
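As a rough illustration of the explicit loop reduction form, the two statements above could be rewritten along the following lines. This is a hand-written sketch, not the output of any PSyclone transformation; the array sizes, the `wp` kind, the value of `jp_sal` and the test data are all assumptions made so the program is self-contained.

```fortran
program maxval_to_loops
  implicit none
  ! Hypothetical stand-ins for NEMO's kind, domain sizes and tracer index.
  integer, parameter :: wp = kind(1.0d0)
  integer, parameter :: jpi = 4, jpj = 3, jpk = 2
  integer, parameter :: jpts = 2, jp_sal = 2
  real(wp) :: un(jpi, jpj, jpk), tmask(jpi, jpj, jpk)
  real(wp) :: tsn(jpi, jpj, jpk, jpts)
  real(wp) :: zmax(3), zref(3)
  integer :: ji, jj, jk

  ! Deterministic test data (not real model fields).
  do jk = 1, jpk
    do jj = 1, jpj
      do ji = 1, jpi
        un(ji, jj, jk) = real(ji - 2*jj + jk, wp)
        tsn(ji, jj, jk, :) = real(ji + jj + jk, wp)
        tmask(ji, jj, jk) = merge(1._wp, 0._wp, mod(ji + jj + jk, 2) == 0)
      end do
    end do
  end do

  ! Original form: intrinsics over implicit ranges.
  zref(2) = maxval(abs(un(:, :, :)))
  zref(3) = maxval(-tsn(:, :, :, jp_sal), mask = tmask(:, :, :) == 1._wp)

  ! Explicit form: start from -HUGE, which is also what MAXVAL returns
  ! when no element is selected by the mask.
  zmax(2) = -huge(zmax(2))
  zmax(3) = -huge(zmax(3))
  do jk = 1, jpk
    do jj = 1, jpj
      do ji = 1, jpi
        zmax(2) = max(zmax(2), abs(un(ji, jj, jk)))
        ! The mask argument becomes a guard around the update.
        if (tmask(ji, jj, jk) == 1._wp) then
          zmax(3) = max(zmax(3), -tsn(ji, jj, jk, jp_sal))
        end if
      end do
    end do
  end do

  ! Self-check: both forms must give identical results.
  if (any(zmax(2:3) /= zref(2:3))) error stop "explicit loops disagree with MAXVAL"
  print *, zmax(2:3)
end program maxval_to_loops
```

Here both reductions share one loop nest for brevity; a transformation acting on one statement at a time would more likely produce a separate nest per statement.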

So the steps to improve this situation may be:

  • [x] MAXVAL, MAXLOC, MINLOC with mask argument supported. (@rupertford)
  • [x] Implicit range reduction to explicit loops transformation
  • [ ] Add OpenMP pragmas with Reduction clause when needed.
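For the last step, one possible shape of the generated code is sketched below, assuming the loops have already been made explicit. The directive combination, the `map` clauses, the scalar temporaries `zmax2` and `zmax3`, and the array sizes are illustrative assumptions, not what PSyclone emits. Scalar temporaries are used because reduction list items are normally named variables rather than array elements such as `zmax(2)`.

```fortran
program maxval_omp_reduction
  implicit none
  ! Hypothetical stand-ins for NEMO's kind and domain sizes.
  integer, parameter :: wp = kind(1.0d0)
  integer, parameter :: jpi = 8, jpj = 6, jpk = 4
  real(wp) :: un(jpi, jpj, jpk), tmask(jpi, jpj, jpk)
  real(wp) :: zmax2, zmax3
  integer :: ji, jj, jk

  ! Deterministic test data (not real model fields).
  do jk = 1, jpk
    do jj = 1, jpj
      do ji = 1, jpi
        un(ji, jj, jk) = real(ji - 2*jj + jk, wp)
        tmask(ji, jj, jk) = merge(1._wp, 0._wp, mod(ji + jj + jk, 2) == 0)
      end do
    end do
  end do

  ! Identity value for a max reduction.
  zmax2 = -huge(zmax2)
  zmax3 = -huge(zmax3)

  ! Without OpenMP enabled these directives are plain comments and the
  ! loop nest runs serially with the same result.
  !$omp target teams distribute parallel do collapse(3) &
  !$omp   map(to: un, tmask) map(tofrom: zmax2, zmax3) &
  !$omp   reduction(max: zmax2, zmax3)
  do jk = 1, jpk
    do jj = 1, jpj
      do ji = 1, jpi
        zmax2 = max(zmax2, abs(un(ji, jj, jk)))
        if (tmask(ji, jj, jk) == 1._wp) then
          zmax3 = max(zmax3, -un(ji, jj, jk))
        end if
      end do
    end do
  end do
  !$omp end target teams distribute parallel do

  ! Self-check against the intrinsic forms.
  if (zmax2 /= maxval(abs(un))) error stop "zmax2 mismatch"
  if (zmax3 /= maxval(-un, mask = tmask == 1._wp)) error stop "zmax3 mismatch"
  print *, zmax2, zmax3
end program maxval_omp_reduction
```

After the loop the temporaries would be copied back into `zmax(2)` and `zmax(3)`. Whether the reduction variable needs an explicit `map(tofrom: ...)` depends on the OpenMP version the compiler implements, so it is spelled out here to be safe.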

sergisiso · Mar 16 '22 15:03