PSyclone
PSyclone copied to clipboard
Issues with InliningTrans (with ArraryRef to StructureofArrayRef conversion?)
Trying to inline the call in:
DO j = ssha_t%internal%ystart, ssha_t%internal%ystop, 1
DO i = ssha_t%internal%xstart, ssha_t%internal%xstop, 1
CALL continuity_code(i, j, ssha_t%data, sshn_t%data, sshn_u%data, sshn_v%data, hu%data, hv%data, un%data, vn%data, &
&sshn_t%grid%area_t, rdt)
END DO
END DO
which references the subroutine:
SUBROUTINE continuity_code(ji, jj, ssha, sshn, sshn_u, sshn_v, hu, hv, un, vn, e12t, rdt)
...
rtmp1 = (sshn_u(ji,jj) + hu(ji,jj)) * un(ji,jj)
...
Ends up with:
DO j = ssha_t%internal%ystart, ssha_t%internal%ystop, 1
DO i = ssha_t%internal%xstart, ssha_t%internal%xstop, 1
rtmp1 = (sshn_u%data + hu%data) * un%data
rtmp2 = (sshn_u%data + hu%data) * un%data
rtmp3 = (sshn_v%data + hv%data) * vn%data
rtmp4 = (sshn_v%data + hv%data) * vn%data
ssha_t%data = sshn_t%data + (rtmp2 - rtmp1 + rtmp4 - rtmp3) * rdt / sshn_t%grid%area_t
END DO
END DO
which misses the array access expressions.
The loc which is causing the issue seems to be here:
ref.replace_with(node.children[dummy_args.index(ref.symbol)].copy())
The example I'm looking at (which is essentially the same as Sergi's) is taking a <class 'psyclone.psyir.nodes.array_reference.ArrayReference'>
and replacing it with <class 'psyclone.psyir.nodes.structure_reference.StructureReference'>
.
Maybe if the original reference is an ArrayReference
this code needs to do something else, but I'm not quite clear what the original code does? @arporter is there an easy alternative to this?
Yes, well spotted. This is the issue. It does a pretty simple substitution without looking if ref
has a subtree which currently gets lost.
If it has a subtree (due to array accessors or structure accessors) it has to merge the outside expression (argument in the call) to the inside expression (subroutine body), which is not as simple as to copy the children because the top-level node type may change.
I have a test implementation on my own branch, but I'm not sure if there's just a better way that doesn't require branchign:
if isinstance(ref, ArrayMixin) and not isinstance(node.children[dummy_args.index(ref.symbol)], ArrayMixin):
if isinstance(node.children[dummy_args.index(ref.symbol)], StructureReference):
symbol = node.children[dummy_args.index(ref.symbol)].symbol
members = []
members.append(node.children[dummy_args.index(ref.symbol)].member)
childmember = node.children[dummy_args.index(ref.symbol)].member
while isinstance(childmember, StructureMember):
childmember = childmember.member
members.append(childmember)
final_member = members[-1]
indices = []
for index in ref.indices:
indices.append(index.copy())
array_member = ArrayMember.create(final_member.name, indices)
members[-1] = array_member
replacement = StructureReference(symbol)
for member in members:
replacement.addchild(member)
ref.replace_with(replacement)
elif isinstance(node.children[dummy_args.index(ref.symbol)], Reference):
symbol = node.children[dummy_args.index(ref.symbol)].symbol
indices = []
for index in ref.indices:
indices.append(index.copy())
replacement = ArrayReference.create(symbol, indices)
ref.replace_with(replacement)
else:
ref.replace_with(
node.children[dummy_args.index(ref.symbol)].copy())
I'm also not sure if there might be other things that need solving?
I'm also not sure why the members don't need to be copied for that code to work so I added that in.
I'm bit late to the party (apologies) and so can't see what you had before but the copy
is recursive so should do the whole shebang for a structure reference. The tricky bit will be making sure that the ultimate member of the structure reference is now of the correct type - it may well have to change e.g. ArrayMember instead of Member.
I suppose that's the same problem as a Reference
having to become an ArrayReference
.
We could also have:
call my_sub(my_struc)
subroutine my_sub(arg)
type(my_type) :: arg
do ji = 1, n
arg%data(ji) = 0
where the original argument to the call will be a Reference but the in-lined code will obviously have a StructureReference.
I found another flaw with my implementation. When i do this loop:
indices = []
for index in ref.indices:
indices.append(index.copy())
The index references I'm copying are to the symbols from the call that is being inlined, and not the new symbols, which is causing issues in my code.
I tested moving this section into a new function and changed the above code fragment to this (without the assertion):
for index in ref.indices:
if isinstance(index, Reference):
self.replace_dummy_arg(index, node, dummy_args)
else:
assert False
for index in ref.indices:
indices.append(index.copy())
It works for indices that are References - the difficulty is when you have something like a(b+1)
it will fail to behave correctly. Probably I want to do ref.walk(Reference) and call replace_dummy_arg
on that instead? I will try.
Still not looked at structure references - if I get it working for arrays correctly I will push it to my tasking branch and link to the temporary implementation here.
112f9f0 contains only what I changed to get this working for my current use-case. If anyone picks this up then it can be a starting point (though I think it is certainly not optimal)
There is a bit of weirdness (which could maybe be avoided and thus is a better implementation) in line 174-178:
for ref in refs:
if ref.parent is not None:
self.replace_dummy_arg(ref, node, dummy_args)
The if statement seemed to be needed as if I found an array access, I'd already have replaced the reference in the tree with a new reference, pointing to the new symbol anyway. I think the better way to do this would be to do:
for ref in refs.reverse():
instead, but I'm not 100% sure it avoids all the issues. With that change I'd hope you could also remove lines 223-225.
Since this is my bug I'm happy to take this on. I'll use @LonelyCat124 's work above as a starting point.
@arporter There's also a(t least one) bug in my implementation, the one I've found is the loop at lines 231-232. I had forgotten how StructureReference
s worked at that point, so it will need fixing. I'll post my updated implementation of that loop here.
I changes lines 215-241 to:
members.append(node.children[dummy_args.index(ref.symbol)].member.copy())
childmember = node.children[dummy_args.index(ref.symbol)].member
members[0].pop_all_children()
while isinstance(childmember, StructureMember):
childmember.detach()
childmember = childmember.member
members.append(childmember.copy())
members[-1].detach()
childmember.detach()
final_member = members[-1]
indices = []
for index in ref.walk(Reference):
if index is not ref:
self.replace_dummy_arg(index, node, dummy_args)
for index in ref.indices:
indices.append(index.copy())
array_member = ArrayMember.create(final_member.name, indices)
members[-1] = array_member
replacement = StructureReference(symbol)
add_to = replacement
for member in members:
add_to.addchild(member)
add_to = add_to.children[-1]
while isinstance(add_to, StructureMember) and len(add_to.children) > 0:
add_to = add_to.member
ref.replace_with(replacement)
The main thing that wasn't working with my previous implementation was I'd not taken into account copy()
also coping children, so it was failing because i had a structure with a Member and was also trying to add another Member which fails. There is probably a better way to avoid this occurrence by being smarter when using copy()
I'm currently having an issue with a symbol seemingly not being in the symbol_table, despite it seemingly appearing in _inline_symbols
so I'm trying to work that out.
My current issue isn't related to this (I think) so don't worry about that.
Ok, I've changed my mind and it definitely seems to be some weird behaviour between this and my other code.
In OMPTaskDirective
, I do self._inline_kernels()
, which does:
kerns = self.walk(Kern)
kintrans = KernelModuleInlineTrans()
intrans = InlineTrans()
for kern in kerns:
kintrans.apply(kern)
kern.lower_to_language_level()
calls = self.walk(Call)
for call in calls:
intrans.apply(call)
After this completes, I do self._find_parent_loop_vars()
, which calls anc._get_private_clause()
on the parent OMPParallelDirective, which is then failing at symbol = symbol_table.lookup(ref_name)
, which for some reason is failing to find a symbol that has been inlined, and as far as I can work out has been added to the symbol table.
@arporter Am I doing something wrong/is the scoping somehow wrong on these variables? I'm fairly sure that after inlining this variable should be found.
Are you sure that the symbol in question hasn't been renamed as part of the inlining? What you're doing is a really rather thorough test of both module inlining and kernel inlining so I wouldn't be at all surprised if there are issues. What does the generated Fortran look like after you've completed the two inlining steps?
I've cherry-picked @LonelyCat124's original commit into branch 1904_inline_bugs
. I've then updated it according to the comment above.
I thought I'd checked that it hadn't been name changed, but I will check again tomorrow morning.
I'll try to find a way to get the generated fortran code at that point and see how it appears.
I think I've simplified things a lot and have a working solution for all the cases mentioned so far. However, it won't yet work if an array slice is passed as an argument, e.g.:
real, dimension(10,10) :: a
call my_sub(a(:,2))
...
subroutine my_sub(array)
real, dimension(10), intent(inout) :: array
integer :: i
do i=1, 10
array(i) = i
end do