SCC: Add vectorisation annotations in SCCRevector and translate in SCCAnnotate

Open mlange05 opened this issue 6 months ago • 3 comments

~Note: Just testing for now...~

This PR changes the way we add OpenACC annotations in the SCC pipelines and refactors SCCAnnotate and SCCRevector. The key change is that SCCRevector now inserts Loki-specific annotations for vector loops, driver loops, routines and vector reductions, and SCCAnnotate then translates these into OpenACC directives. This prepares for an additional re-vectorisation scheme and for eventual OpenMP-offload support in the SCC pipelines, since SCCAnnotate can then switch between directive flavours easily.
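To illustrate the two-stage idea, here is a minimal string-level sketch of translating generic annotations into a chosen directive flavour. It is not Loki's implementation: the `DIRECTIVES` table, the `translate` helper and the OpenMP entries are all assumptions made up for this example, and the real transformations operate on Loki IR nodes rather than text lines.

```python
# Hypothetical sketch, not Loki code: generic "!$loki" annotations are
# looked up in a per-flavour table, so switching flavour is a table swap.
DIRECTIVES = {
    "openacc": {
        "loop vector": "!$acc loop vector",
        "loop seq": "!$acc loop seq",
        "routine seq": "!$acc routine seq",
        "routine vector": "!$acc routine vector",
    },
    # Assumed OpenMP-offload counterparts, for illustration only.
    "openmp": {
        "loop vector": "!$omp simd",
        "routine seq": "!$omp declare target",
        "routine vector": "!$omp declare target",
    },
}


def translate(lines, flavour="openacc"):
    """Replace known '!$loki ...' lines with the flavour's directive."""
    table = DIRECTIVES[flavour]
    prefix = "!$loki "
    out = []
    for line in lines:
        stripped = line.strip()
        key = None
        if stripped.lower().startswith(prefix):
            key = stripped[len(prefix):].lower()
        if key in table:
            # Preserve the original indentation of the pragma line.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(indent + table[key])
        else:
            # Unknown annotations and ordinary source lines pass through.
            out.append(line)
    return out


source = [
    "!$loki routine vector",
    "  !$loki loop vector",
    "  do jl = 1, n",
    "  end do",
]
acc_lines = translate(source, "openacc")
omp_lines = translate(source, "openmp")
```

Because the first stage only emits flavour-neutral markers, a new re-vectorisation scheme would not need to know which directive family is eventually generated.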

There's also quite a bit of refactoring and clean-up, in particular in the upper control methods of SCCAnnotate, as well as the utility method check_routine_pragmas. Overall I hope this is now a bit cleaner and better structured for future development.

In more detail:

  • wrap_vector_section has been made a standalone routine, so that it can be re-used in the alternative re-vectorisation scheme.
  • SCCRevector now annotates vector and sequential loops with respective !$loki loop pragmas, as well as marking kernel routines !$loki routine seq|vector and adding !$loki loop vector-reductions for vector loops in marked vector-reduction regions.
  • SCCRevector also re-uses some of these utilities when re-vectoring kernel loops; the respective routines have been refactored to work on generic node "sections" (tuple of Node).
  • SCCAnnotate converts !$loki loop and !$loki routine directives into !$acc equivalents and adds private clauses to vector loops and !$acc data present clauses to kernel routines.
  • The check_routine_pragmas utility has been renamed check_routine_sequential and no longer inserts pragmas at all. It merely checks if a routine has been marked with !$loki routine seq. The second use case (check for repeated processing) is now handled explicitly in SCCAnnotate when annotating kernel routines (which cannot happen twice!).
  • A general tidy-up of SCCAnnotate to make utility methods object-bound and re-use common ones in the driver processing code path. We still need the block_dim, but the horizontal is no longer needed in SCCAnnotate.
  • test_scc_vector.py now also checks for the insertion of certain !$loki loop pragmas.
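The inspect-only check and the private-clause handling described above can be sketched roughly as follows. This is a simplified stand-in under stated assumptions: the function below shares only its name with Loki's check_routine_sequential, `annotate_vector_loop` is a hypothetical helper, pragmas are modelled as plain strings, and the variable names in the usage lines are invented.

```python
import re


def check_routine_sequential(pragmas):
    """Return True if any pragma is '!$loki routine seq'.

    Stand-in for the renamed utility: it only inspects the pragmas
    and never inserts or modifies anything.
    """
    pattern = r"!\$loki\s+routine\s+seq"
    return any(
        re.fullmatch(pattern, p.strip(), re.IGNORECASE) for p in pragmas
    )


def annotate_vector_loop(pragma, private_vars=()):
    """Turn '!$loki loop vector' into '!$acc loop vector [private(...)]'.

    Hypothetical helper: any other pragma is returned unchanged.
    """
    pattern = r"!\$loki\s+loop\s+vector"
    if not re.fullmatch(pattern, pragma.strip(), re.IGNORECASE):
        return pragma
    acc = "!$acc loop vector"
    if private_vars:
        acc += " private(" + ", ".join(private_vars) + ")"
    return acc


# Usage with invented variable names.
is_seq = check_routine_sequential(["!$loki routine seq"])
plain = annotate_vector_loop("!$loki loop vector")
with_private = annotate_vector_loop("!$loki loop vector", ("ztmp", "zfac"))
```

Keeping the check free of side effects means it is safe to call from several places without changing what is later annotated.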

mlange05, Aug 09 '24 11:08