PSyclone
PSyclone copied to clipboard
(closes #310) add support for acc enter data and acc update to NEMO API
The (NEMO)ACCEnterDataDirective works with a list of strings and is therefore easy to make work with structures and deep copies. However, ACCUpdateDirective expects a list of Symbols and it's not possible to get this to work for structure accesses. I could change it to expect a list of Signatures from which we could construct suitable strings. Ideally, we would actually construct References and have these as children of the Directive node (and thus do away with the begin_string
method) but that's a step beyond where we are now.
Codecov Report
Base: 99.85% // Head: 99.85% // Increases project coverage by +0.00%
:tada:
Coverage data is based on head (
ae6aa0c
) compared to base (1298d79
). Patch coverage: 100.00% of modified lines in pull request are covered.
Additional details and impacted files
@@ Coverage Diff @@
## master #1554 +/- ##
========================================
Coverage 99.85% 99.85%
========================================
Files 285 286 +1
Lines 40801 40960 +159
========================================
+ Hits 40740 40900 +160
+ Misses 61 60 -1
Impacted Files | Coverage Δ | |
---|---|---|
src/psyclone/psyir/tools/dependency_tools.py | 100.00% <ø> (ø) |
|
src/psyclone/gocean1p0.py | 99.86% <100.00%> (-0.01%) |
:arrow_down: |
src/psyclone/nemo.py | 100.00% <100.00%> (ø) |
|
src/psyclone/psyir/nodes/acc_directives.py | 100.00% <100.00%> (ø) |
|
src/psyclone/psyir/transformations/__init__.py | 100.00% <100.00%> (ø) |
|
...psyclone/psyir/transformations/acc_update_trans.py | 100.00% <100.00%> (ø) |
|
src/psyclone/transformations.py | 99.37% <100.00%> (+0.16%) |
:arrow_up: |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
I've extended what I have to put local variables in a create
clause. However, this still fails at runtime, probably because it deals in reference counts whereas essentially we need a new array every time we enter the routine.
Actually, it turns out I wasn't re-compiling the right file! I need to redo and check.
I had to add an exit data delete(...)
and then it worked (where ...
are the local variables that are in the create
clause).
I've quite radically reworked ACCUpdateDirective
. The constructor no longer takes a list of signatures, but rather a list of nodes from which a new method in the class, lower_to_language_level()
, is responsible to extract the required signatures. I first tried to assign this list of nodes as the children of the directive, but as it inherits from StandaloneDirective()
, which can't have any children, it didn't quite work... thus the new argument nodes
. The tests haven't been updated yet and so are currently broken.
Do we have to keep children
as a constructor argument even for classes which disallow them?
Hi @nmnobre, yes, all node constructors have to take children
. In this case the children will need to be Clause
nodes, in a similar way to what I am doing in #1396. This may be better done in a create()
method rather than through the constructor.
Hi @nmnobre, yes, all node constructors have to take
children
. In this case the children will need to beClause
nodes, in a similar way to what I am doing in #1396. This may be better done in acreate()
method rather than through the constructor.
Ah yes, that's right. We are changing the way we think about these. Even standalone directives will have children now!... I take it you're happy I proceed with the directive rework then?
In hindsight, I think I made the wrong decision...
I tried to make the update
directive implementation mimic that of enter data
, but I should've done the opposite...
We shouldn't assume enter data
will always take all the data in parallel
and kernels
regions within the routine. The decision of what data the directive takes should belong to the transformation class which is context aware.... It's the enter data
directive which needs changing to accept a list of signatures. Back to the drawing board.
We shouldn't assume enter data will always take all the data in parallel and kernels regions within the routine.
What's your thinking here? My assumption was that we should assume that as we know it's going to be needed.
What's your thinking here? My assumption was that we should assume that as we know it's going to be needed.
I agree, it's eventually going to be needed, but might we not be too eager? In the sense a CPU region might precede the first GPU usage.
@arporter OpenACC enter data directives are now placed just before the first compute construct, causing some tests to fail. What's the right way to address this? Should we change the test or return the directive to its original position for some of the APIs?
OpenACC enter data directives are now placed just before the first compute construct, causing some tests to fail. What's the right way to address this? Should we change the test or return the directive to its original position for some of the APIs?
I don't mind. If the new location is valid and it's easiest to do it that way then I'm happy for the tests to be modified.
I've tried tidying the code a bit but:
- If there's an
acc_directives
module, shouldn't we switch to use anacc_transformations
module too? - If the variables for the
enter data
directive are collected in the directive's code, why are the variables for theupdate
directive collected in the transformation's code? - I'd like to make the NEMO API, "THE OTHER GENERIC" API, by dropping NEMO specific conditionals and making Dynamo and GOcean the exceptions. What's the point of adding if conditionals specific to NEMO when they don't really depend on NEMO, but rather on not being Dynamo or GOcean... The few configuration settings we are currently using I'm guessing are most useful for implicit loops, but that's not a NEMO feature, it's a Fortran feature, we just have to define a configuration file format that any app could use....
- Without proper dependency analysis, I don't think we can do much better with the placement of the
update
directive. OpenACC compute constructs and subroutine calls should be treated differently from CodeBlocks, the latter are (always?) for the CPU whereas the former are (possibly) for the GPU...
Thanks @nmnobre. re. your first point, we're still in a state of flux with regards to where transformations live. We used to have everything in transformations.py
, then we had stuff under nemo/transformations
and now, with the move to PSyIR, we have them under either domain/xxx/transformations
or psyir/transformations
depending on whether or not they are API specific.
If we're introducing a new transformation here (I can't remember!) then we should put that in the appropriate location. If not then our practise is to only move them in PRs that are restricted to just that change (otherwise it's hard to review).
re. your second point, that sounds like an oversight on my part. In order to be able to reason about the tree we're trying to ensure that Nodes hold all necessary information rather than delaying to code-generation time (as we used to). Since the list of variables may need to be updated in light of subsequent transformations we probably want the functionality that generates it to be a part of the Node itself. Does @sergisiso agree?
re. your third point, that sounds like something that needs further discussion. If we've got a lot of conditionals then probably we need to restructure some classes as things that are domain specific should be in domain specific classes (ideally). Finally, your fourth point also needs more discussion for me to understand. I think we've got sufficient dependence analysis (with the exception of CodeBlocks).
Without proper dependency analysis, I don't think we can do much better with the placement of the update directive. OpenACC compute constructs and subroutine calls should be treated differently from CodeBlocks, the latter are (always?) for the CPU whereas the former are (possibly) for the GPU...
Can you list what would you need the dependency analysis to do? Maybe with some examples? For now we have focused on inside-loop analysis but @hiker is looking at dep. analysis for loop fusing which may have some common functionality with what you need.
re. your third point, ...
I can't comment on the if-statements, but I think it would be great to rename NEMO to ... something else. We might appear to people interested in PSyclone to not really support generic codes, only NEMO. We could provide different config files. e.g. psyclone-nemo.cfg, psyclone-roms.cfg, ... and a generic one (which might just be the nemo one :) ).
We could call it .... 'evolution' :)
Time for a separate ticket for that?
Hi @arporter, as discussed, this should now be ready for a first look 😊
I've also just realised that I'm the original author of this PR so GitHub isn't going to let me approve it. I suggest responding to all of my current comments and once that's done we can create a new PR.
Alternatively, @arporter when you finish your review I can do a quick read and finish the approval myself. I would like to be aware of what is being done regarding data directives anyway so it would be useful for me.
Hi @nmnobre, is this ready for another look now?
Hi @nmnobre, is this ready for another look now?
Hmmm.... I haven't really addressed all of your comments yet... I don't know if you'd rather see it all at once or if you're happy to close the conversations I've addressed....
Unless there are things with wide-ranging effects, I'd suggest doing the whole thing so we (I) don't get confused :-)
Having written that, I realise that we also need to document what's being done as it is significant functionality. I think the place to do this is the OpenACC section of the Developer Guide (https://psyclone-dev.readthedocs.io/en/latest/transformations.html#openacc). Please could you describe the strategy used - i.e. the combination of Enter Data and Update and the algorithm used to place them. That will then be a big help to me in reviewing the code too.
Done?
I feel this is now ready for a second look @arporter !
Ready for another look @arporter !
Please could you fix/disable (as appropriate) the new pylint warnings in acc_update_trans:
psyir/transformations/acc_update_trans.py:382:12: W0621: Redefining name 'ACCUpdateDirective' from outer scope (line 45) (redefined-outer-name) psyir/transformations/acc_update_trans.py:45:0: W0611: Unused ACCUpdateDirective imported from psyclone.psyir.nodes (unused-import)
Done.
Ready for another look.
Nice. I'm happy now. Just a couple of comments to improve and then please request a review from Sergi. (Tests and examples are all fine. Ref. guide builds fine.) I've taken the liberty of bring the branch up to master and also fixing a reference to #310 that I found in one of the tests. Therefore you'll need to pull before you make any changes.
Ready for yet another look. :) Could you please also make sure all resolved conversations are marked as such?
Link check is failing for the Yaxt repo. but that's not the fault of this PR. Examples and tests are all good with compilation. This is now ready for @sergisiso to take a look.