(#Closes #2592) Routine has a _symbol attribute.
@sergisiso @arporter I have a very initial implementation of the changes I feel are needed for the Routine node specifically to support this change. I'm a bit concerned about what I needed to do w.r.t to handling the creation of a Routine that doesn't have a parent and adding a parent in later (which is probably something reasonable to be able to do), as I needed to override _parent as both a property and setter.
If either you have time to have a look and let me know how you feel about this implementation before I move much further, and any suggestions/alternatives would be appreciated.
Thanks @LonelyCat124, I think what you're missing is use of the attach and detach mechanism. See https://psyclone-dev.readthedocs.io/en/latest/psyir.html#the-parent-child-relationship and possibly the section after it on dynamic updates.
@LonelyCat124 I had a look at this, I think overriding the _parent setter is a good possible alternative and it should not cause problems with the ChildrenList automatic parent handling. (the other alternative is using Container._update_signal but this will be executed any time there is ANY update below a container, which could quickly be very expensive, so I prefer to explore your _parent.setter). I belive you also need to delete the symbol when parent is disconnected (the _parent.setter is given a parent different than _parent_node, before discarding the old _parent_node, it needs something like: _parent_node.symbol_table.remove(self._symbol)?)
What I am not sure is about what you did inside the name.setter "if not self._symbol" branch. I think if the Scope already have a symbol with the given name it could be because another routine or import already has it, and should fail. Maybe for new Routines should just set self._symbol = RoutineSymbol("name", NoType()) in the __init__, before calling the supper constructor, would that work?
I think if the Scope already have a symbol with the given name it could be because another routine or import already has it.
I see that one problem with this is that the fparser2reader forward-declares routine symbols (with the _process_routine_symbols calls). Because Fortran can have a Call to a routine that is declared below itself. Maybe we could modify the frontend to put empty Routines in a first pass (of the caller of _process_routine_symbols), so the symbols exist, and then populate them on a second pass.
I've merged master in (and fixed the conflict) and I now get some failing tests that aren't clear as to why they're failling for me.
One such example is test_generate_container in fparser2_container_test.
The code seems to assume that the container generated from generate_container shouldn't have any children, wheras now it does (the subroutine it contains). Was there a reason this doesn't have a child on master? I guess creating the Routine is now adding it to the parent container, but if you use the full fparser chain it would already do that...
generate_container()'s days are numbered - #1115. Perhaps they've now run out?
I'm not sure - what was their original use-cases? If they're used in any scripts or any other part of PSyclone its not great to remove them, I can just fix the tests (there are also other failing tests I'm yet to examine).
Edit: grepping in PSyclone only shows the tests using it, so it depends on any other uses externally.
Edit: grepping in PSyclone only shows the tests using it, so it depends on any other uses externally.
In that case I would remove it. I'm not aware of any external uses of it. @JulienRemy do you use fparser2reader.generate_container()?
We've never used the attribute I implemented in the closed PR nor fparser2reader.generate_container(). No problem with doing this differently as far as I know!
Moving this into Routine looks to have some complex side effects w.r.t the DSL frontends.
One example is in GOcean where simply initialising a GOcean PSyIR tree is problematic. From what I understand, it looks like we first get a standard PSyIR tree, and then the raise_psyir_2_gocean_kern_trans takes standard PSyIR nodes and replaces them with the GOcean specific ones (e.g. GOceanKern).
The problem now is that since the Routine has added its' symbol to the GOceanContainer, there are also Call nodes to the RoutineSymbol, so when the transformation attempts to replace the Routine with the GOceanKern, the Routine attempts to remove its own symbol from the Container's symbol table, which fails in _validate_remove_routinesymbol.
I'm not really sure how to resolve this issue - any advice would be appreciated.
That does sound difficult. Could we perhaps overload the replace_with() method of Node for a Routine and do something special in there?
Yeah - I'm just not sure what the "special" thing should be. I can't decide if we should force removal of the symbol (somehow) in this case, but I'm not sure what the result of that would be, nor if we have that functionality.
I did wonder about not adding the symbol to the parent if its a GOceanContainer or an LFRicContainer (or perhaps if its not either specifically a Container or FileContainer with type(..) != comparisons), but I'm not sure if that would break lowering of Kerns to Routines, since I'm not sure how they're lowered. It appears like it might work, but I'm not sure when the children of a GOceanContainer are lowered based on its lower_to_language_level call, which doesn't appear to loop through its children and lower them.
There's also some issues with outputs it seems like due to copying the tree.
When copying the Container, it keeps the symbol in the new containers symbol table (i think? Its not clear that this always happens since not every test fails...), but for one of the inline trans tests (that i'm currently looking at) it fails here:
psyir/backend/visitor.py:182: in __call__
tree_copy = node.root.copy()
psyir/nodes/node.py:1602: in copy
new_instance._refine_copy(self)
psyir/nodes/scoping_node.py:102: in _refine_copy
super(ScopingNode, self)._refine_copy(other)
psyir/nodes/node.py:1585: in _refine_copy
self.children.extend([child.copy() for child in other.children])
psyir/nodes/node.py:261: in extend
self._set_parent_link(item)
psyir/nodes/node.py:185: in _set_parent_link
node._parent = self._node_reference
psyir/nodes/routine.py:194: in _parent
parent.symbol_table.add(self._symbol)
because sub is already in the SymbolTable, which I can only assume was due to copying in, but I'm not sure how I avoid this.
Edit: this happens at the fortran_writer(psyir) call.
TODO:
-
When replacing a Routine with a Routine - pass the symbol from the original routine to the new one (and adjust name accordingly?).
-
ScopingNode check for copied Routines and remove symbols corresponding to them.
I fixed the above things, just roaming through failing tests and fixing those that I understand that are caused by these changes.
I've hit one that I don't understand how I can have changed the behaviour of the code.
In frontend/fparser2_test.py there is test_deferred_array_size which now fails because:
KeyError: "Symbol table already contains a symbol with name 'n'.
I don't think I modified the behaviour of process_declarations so I'm confused how this passed before - fparser2.py adds the n symbol to the symbol table as an unresolved integer type, and then it fails when it comes across the actual symbol n's declaration.
@sergisiso @arporter Any ideas on what I might have messed up? I'll keep investigating other tests and update to master when I'm done and see if its still an issue.
Edit - will continue documenting tests i don't understand:
Also kernel_module_inline_trans_test.py::test_validate_unsupported_symbol_shadowing fails on "But it is fine if it shadows itself."
I think this change maybe straight up broke ModuleInlineTrans...
Also some tests in tests/psyir/tools/call_tree_utils_test.py are also failing, i don't know what this does so will need to discuss I expect.
Also having issues in test_lfric_adjoint test test_generate_lfric_adjoint_multi_routine as after copying I can't remove the old RoutineSymbol that I want to replace with the new one. Maybe the option here is to remove all the references to things before re-adding them but that seems tricky.
I also seem to have some failing tests in inline_trans because the order of Routines inside a Container has changed (and doesn't match the input code). I'll try to work out what can have changed to cause this first as it may fix a number of tests.
Ok it again looks like not an easy fix. The test is simple:
code = (
"module test_mod\n"
"contains\n"
" subroutine run_it()\n"
" integer :: i\n"
" real :: a(10)\n"
" do i=1,10\n"
" call sub(a, i)\n"
" end do\n"
" end subroutine run_it\n"
" subroutine sub(x, ivar)\n"
" real, intent(inout), dimension(10) :: x\n"
" integer, intent(in) :: ivar\n"
" integer :: i\n"
" do i = 1, 10\n"
" x(i) = 2.0*ivar\n"
" end do\n"
" end subroutine sub\n"
"end module test_mod\n")
During fparser, we create empty Routine for run_it and sub and add them to the test_mod Container. The implementation we decided on for filling them was we then add the internal ASTs, and then detach them (to be reconnected to their parent in process_nodes as is currently done).
Unfortunately for this example case this doesn't work as expected since run_it contains a call to sub, when calling routine.detach() for sub it returns a ValueError as its not safe to detach the node as there is a reference to its symbol.
An alternative solution to this would be to not detach Routine nodes as current and return None from the _subroutine_handler, however there is a bad case where a Routine should result in a CodeBlock, but its symbol is called elsewhere in the file, which I expect would mean the Routine fails to detach and adding a CodeBlock of this Routine would result in two implementations of the Routine in the tree, where one is potentially partly filled with the code and the other is a full implementation from a CodeBlock.
I wonder if I should force routine.detach() to work even if its symbol can't be removed, but I'm not sure what downsides this might have? I guess there is the "what if i'm attaching a routine to a container where its symbol already exists" issue, but I can check if its symbol is already present before trying to add it. Any thoughts @sergisiso?
Edit: This inability to remove Routines due to their symbols being prevented from removal also breaks _find_or_create_psyclone_internal_cmp at container.children.extend(dummymod.pop_all_children())
Remaining failing tests are in these files:
tests/psyir/backend/fortran_unsupported_declns_test.py
tests/psyir/frontend/fparser2_part_ref_test.py
tests/psyir/frontend/fparser2_test.py
tests/psyir/tools/call_tree_utils_test.py
tests/psyad/domain/lfric/test_lfric_adjoint.py
tests/domain/common/transformations/raise_psyir_2_alg_trans_test.py
tests/domain/common/transformations/kernel_module_inline_trans_test.py
tests/psyir/transformations/inline_trans_test.py
tests/psyir/frontend/fparser2_select_case_test.py
tests/domain/lfric/transformations/raise_psyir_2_lfric_kern_trans_test.py
tests/domain/lfric/transformations/raise_psyir_2_lfric_alg_trans_test.py
tests/domain/lfric/lfric_kern_test.py
I'd guesstimate that 2/3rds or more fail because Routine.detach() fails at various points as the RoutineSymbol is referenced.
I'm now forcing removal of symbols in ScopingNode._refine_copy, and keeping a list of "symbols_to_fix", but I'm not sure they ever actually need fixing as I think maybe something else is fixing the attachment of the Call nodes to the new Routine nodes, and since the new Routine nodes' symbols are added to the new symbol table its fixed itself.
I'm running the test suite with an assert False statement if I find a Call node whose symbol is in the list of symbols_to_fix, but otherwise I'll work with this for now.
It does have some un-pythonic/OO implementations in various points now, and once this is closer to review (i.e. the tests all work) then I probably need to discuss this with the reviewer probably.
The changes break the test for lines 404-410 in psyir/tools/call_tree_utils.py, I'm not quite sure I understand this code, but I think it becomes impossible to fail to find a (resolved) Routine in a module now so I'm not sure this error can happen? Would need @hiker to check that section I think. test_get_non_local_read_write_info_errors also fails - I assume for a similar reason but I'm not sure.
Still one test failing in fparser2.py. test_deferred_array_size is one, as previously discussed. I also broke some error code in _process_decln which is no longer reachable as the own_routine_symbol tag is no more in existence - I instead did a workaround by searching for the ancestor Routine and checking for return_symbol which seems to resolve those failing tests. @sergisiso or @arporter might be good to discuss and see if we can work out what has changed with the deferred array size declaration/how to fix it .
My remaining concern is that I broke something in adjoint @arporter. The test test_lfric_adjoint::test_generate_lfric_adjoint_multi_routine fails when outputting during symbol_table.deep_copy at this section:
for routine in symbol.routines:
new_routines.append((new_st.lookup(routine.symbol.name), routine.from_container))
symbol.routines = new_routines
What seems to be happening is there are some symbols that refer to a "routine" test_code_r4 which has (I assume) been adjointed into adj_test_code_r4, and so when it tries to lookup the test_code_r4 symbol that it needs to sort out the GenericInterfaceSymbol it fails because its not been copied into the new symbol table. Its not there because its no longer present in the old one now either. This all happens during _refine_copy of the fortran_writer - it might be best if we can have a discussion about this to work out whats going wrong/what I've changed that has broken expectations.
All the other remaining tests are around kernel_module_inline_trans - I've left those alone for now - I think they're all happening due to symbols already being present in symbol tables but I need to dive into why this happens for some (but not all) tests.
Edit: adjoint_visitor.py does node.copy() at line 437 which copies the GenericInterfaceSymbol containing the names of routines that will then be adjointed. Need to check what the correct intent here is and what the master branch currently outputs @arporter / me. Best option is maybe to run the lfric_adjoint_multi_routine test? Maybe relates to #1946 in some ways too but I'm not sure.
Ok, the KernelModuleInlineTrans bugs were super easy to fix (it was adding the symbol manually before adding the Routine as a child which of course was breaking). The only remaining issues are the ones in the previous post now. I will merge master and make this ready for a first review despite the last few failing tests.
There are accesses to some private members of objects in a few places right now, I couldn't find a good solution to these issues (since adding public accessors to them in general is not a good thing for the code).
One for @arporter or @sergisiso I think.
The changes break the test for lines 404-410 in
psyir/tools/call_tree_utils.py, I'm not quite sure I understand this code, but I think it becomes impossible to fail to find a (resolved) Routine in a module now so I'm not sure this error can happen? Would need @hiker to check that section I think.test_get_non_local_read_write_info_errorsalso fails - I assume for a similar reason but I'm not sure.
Sorry, noticed this a bit too late, I'll try to have a look tomorrow.
One issue (that I recently found) that might(!!), trigger that code is issue #2657(?)
Check return_symbol is setup.
The changes break the test for lines 404-410 in
psyir/tools/call_tree_utils.py, I'm not quite sure I understand this code, but I think it becomes impossible to fail to find a (resolved) Routine in a module now so I'm not sure this error can happen? Would need @hiker to check that section I think.test_get_non_local_read_write_info_errorsalso fails - I assume for a similar reason but I'm not sure.Sorry, noticed this a bit too late, I'll try to have a look tomorrow.
One issue (that I recently found) that might(!!), trigger that code is issue #2657(?)
@LonelyCat124 , the test originally removes the psyir for the surboutine from the container, but does not remove the symbol from the symbol table. Your code now removes the symbol, meaning instead of returning the name for a subroutine that does not exist anymore, it now returns an empty list (which is indeed better of course).
My feeling is (don't have enough time to dig into that) that #2657 could trigger the same code:
module my_api
use some_mod, only: my_proc
end module my_api
...
use my_api, only: my_proc
call my_proc()
When trying to get the PSyIR for my_proc, it will find the symbol, but no PSyIR.
My suggestion would be to either try the above code and replace this test, or put a #TODO 2657 with an explanation there (and remove the lines that print the warning in call_tree_utils), so when I (probably my task since I need it) get to 2657, I can re-evaluate this.
Thanks @hiker - I also looked through it with Andy and Sergi and we realised that while the Routine can't be there, there could be a CodeBlock with the RoutineSymbol still present, so the code is still reachable. I've adjusted the tests to simulate that case and we added additional error messages to handle the update to the old cases.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.86%. Comparing base (
652ad61) to head (ab65d7a). Report is 56 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #2596 +/- ##
=======================================
Coverage 99.86% 99.86%
=======================================
Files 354 354
Lines 49046 49112 +66
=======================================
+ Hits 48982 49048 +66
Misses 64 64
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@sergisiso I addressed the comments (some with changes, some with discussion) and merged Andy's symbol table deep copy changes into the master too.
@sergisiso Can pass back to you to have another look. I'm happy to adjust to have the code allow an attached RoutineSymbol as long as no Routine/CodeBlock exists in the scope with the same name if you think that is appropriate.
@sergisiso I implemented that change and I added a comment above showing how we could remove some of the "copying" of things - I think that implementation is worse due to how tags currently work, but maybe there's a better solution to tags somehow.
Edit: I think something in module_info may fail coverage now because its no longer possible to happen - if so I'll have to check we are ok to remove it or if it can be tested some other way.
@sergisiso I responded to the comments, I'll upstream the changes I've made when the tests have run - maybe the last 2 points are best to have a discussion once you're back from leave next week?
- [x] Remove tag from symbol.
- [x] Enable public setter (and accesor) for .symbol
- [x] Ensure that replace_with doesn't lose tag infomation.
- [x] Maybe add the "parent" setter instead to
set_parent_link?
@sergisiso test_named_invoke_name_clash in the test_psyGen.py is now failing and I'm not sure why/what is meant to do the symbol renaming here.
The code now generates:
SUBROUTINE invoke_a(invoke_a, b, istp, instead of SUBROUTINE invoke_a(invoke_a_1, b, istp,, but I'm not even sure what I'm generating or the code route since I've never really used these older parts of the code.