PSyclone icon indicating copy to clipboard operation
PSyclone copied to clipboard

Improve collapsing support in loop transformations

Open LonelyCat124 opened this issue 4 years ago • 11 comments

The OMP collapse clause is not implemented for all directives and it can cause exceptions in transformations apply.

I believe the relevant transformations are OMPLoopTrans and OMPTaskloopTrans.

LonelyCat124 avatar Aug 06 '21 12:08 LonelyCat124

@LonelyCat124 Is there any more info about what directive/compiler versions this is not supported in? I tried !$omp do collapse with gcc 9.3 and it seems to work. I am thinking in re-enabling it for this directive.

sergisiso avatar Oct 18 '21 14:10 sergisiso

I didn't mean in compilers, I just meant in PSyclone (I didn't see an issue for the no implementation for the OMP DO Directive). I'm not sure about status in compilers - probably we should check gcc 7.4 (default ubuntu 18.04) and an intel release. If they support collapse we could re-enable it for most cases.

LonelyCat124 avatar Oct 18 '21 15:10 LonelyCat124

I think collapse has been around quite a long time (at least 6 or 7 years). We support it in OpenACC so should be able to generalise that.

arporter avatar Oct 18 '21 15:10 arporter

Oh, I see I've already done one review where you add this functionality @sergisiso! Too many Issues... :-)

arporter avatar Oct 18 '21 15:10 arporter

@arporter In this case I meant the TODOs #1370 in the src/psyclone/transformations.py

I see I can delete the OMPLoopTrans check and everything works, but I am not sure if it is the same with the taskloop one. @AidanChalk any idea if this is generally supported?

sergisiso avatar Oct 18 '21 15:10 sergisiso

I haven't implemented anything for taskloop, so probably I'd need to for that to remove this. I assume the parent class has no default but I don't remember.

LonelyCat124 avatar Oct 18 '21 19:10 LonelyCat124

Although the PSyIR backend could quite easily support this, we don't currently because it also requires that the gen code-generation pathway used by LFRic be updated.

arporter avatar Nov 04 '21 11:11 arporter

Collapse is supported by OpenMP and OpenACC. Currently we provide a collapse value to the transformation or directive, e.g: acc_loop.apply(child, {"collapse": 2})

There are 2 limitations:

  • The collapse value could be validated before generating the code (at the validate_global_constraints)
  • We could add the option to collapse as much as possible (we have code that figures this out in nemo utils.py - note that the can_loop_be_parallelised needs a only_nested_loops=False or change its default value)

@jwallwork23 has hit some of these issues.

sergisiso avatar Jul 13 '23 11:07 sergisiso

Can we close this issue and/or make new issues for the remaining limitations? I think we should maximise this collapse capability for NG-ARCH. We need to work out how #2573 will interact with this but having an enquiry function for "number of loops that can be collapsed" might be a useful function for a loop, especially with how it may interact with hoisting.

LonelyCat124 avatar May 22 '24 15:05 LonelyCat124

That seems sensible to me @LonelyCat124.

arporter avatar May 22 '24 15:05 arporter

If you don't mind I will keep this open. All the features are there but I still wanted to move the code from nemo utils to the transformation as part of this and let psyclone include a code comment with the reason it decided to stop collapsing. I do this soon.

sergisiso avatar May 23 '24 08:05 sergisiso