PSyclone Improve dependency analysis loop parallelisation check

See a pytest with minimal examples of the loop patterns that are still not parallelized in NEMO.

def test_arrays_parallelise2(fortran_reader):
    '''Tests the array checks of can_loop_be_parallelised.
    '''
    source = '''program test
                integer ji, jj, jk
                integer, parameter :: jpi=5, jpj=10
                real, dimension(jpi,jpi) :: mask, umask
                do ji = 1, jpi
                    jj = ji + 1
                    mask(jj, 1) = umask(jj, 1)
                end do
                do jj = 1, size(mask(:,:), 1)
                   do ji = 1, size(mask(:,:), 2)
                      mask(ji, jj) = -1.0d0
                    end do
                end do
                do jj = 1, size(mask, 1)
                   do ji = 1, size(mask, 2)
                      mask(ji, jj) = -1.0d0
                    end do
                end do
                end program test'''

    psyir = fortran_reader.psyir_from_source(source)
    loops = psyir.children[0].children[0:3]
    dep_tools = DependencyTools()

    parallel = dep_tools.can_loop_be_parallelised(
                    loops[0], only_nested_loops=False)
    assert parallel is True
    parallel = dep_tools.can_loop_be_parallelised(loops[1])
    assert parallel is True
    parallel = dep_tools.can_loop_be_parallelised(loops[2])
    assert parallel is True

The first loop has an induction variable as the array accessor. Is there something we can do about this?
The second loop is refused by the symbolic analysis because of (:). But I am wondering if being inside the SIZE intrinsic should be a case treated differently, because really the array data in not read or written. There is a more general question here: in Fortran the loops bounds are just evaluated at the beginning, so even without the size intrinstic this shouldn't be considered a READ for the loop iterations prespective.
The third case could also be using an special case for the SIZE, but I mention it separately because at the moment this breaks with:

src/psyclone/psyir/tools/dependency_tools.py:283
        for i, indx in enumerate(comp_ind1.iterate()):
>           partition_infos.append((indices_1[i].union(indices_2[i]), [indx]))
E           IndexError: list index out of range

I did saw this in the first review of the dependency analysis but then I couldn't find it anymore because in NEMO we now add explicitly array accessors to all array references. So it is not a problem in NEMO anymore, but still worth fixing.

Jun 06 '22 13:06 sergisiso

Adding a transformation that handles induction variables should be reasonable easy (for the useful cases, e.g. for an i loop: 'im = i-1', and 'n=n+k'), we just need to replace all instances in the (copy of the) psyir tree.

The others are more difficult - yes, I believe we could define a SIZE function in sympy, but the problem is that the parser chokes on the :, so we don't even reach that stage that SIZE would be used (unless we modify the tree to replace size(....) with just size() (i.e. remove any argument) before parsing. But I would prefer a better solution, since we need to handle array expressions anyway.

I intend to look at sympy again - in the past I never tried to use array as types (since I wanted to avoid having to declare functions), but by now we are doing this anyway, so maybe I could try to use sympy array types instead of handling Fortran arrays as functions, which might solve the problem.

Jun 09 '22 00:06 hiker

I just realise that some of the questions (e.g. do we consider loop bounds as a read access??) are actually not at all sympy related. As such I'd guess that we could handle SIZE to not add a read dependency. In general, I think we should add the boundaries as read access - if you don't need them, just run the access info collection on the loop body. Hmm - that might be difficult with an outer loop and several inner loops. Maybe we should add a option which can be used to indicate to e.g. ignore loop bound reads.

Jun 09 '22 00:06 hiker

I've opened #1750 and #1751 as 'smaller' todos to decouple the three bugs, and will handle the actual crash with this ticket here if you are ok with that.

Jun 09 '22 02:06 hiker

PSyclone PSyclone copied to clipboard

Improve dependency analysis loop parallelisation check

PSyclone
PSyclone copied to clipboard