[ASR] Handle OpenMP Pragma in AST->ASR
Now, the ASR generated for the example below is:
Example:
!$omp parallel private(partial_sum) shared(total_sum)
!$omp do
do i = 1, n
partial_sum = partial_sum + a(i)
end do
!$omp end do
!$omp end parallel
ASR:
(DoConcurrentLoop
    ((Var 2 i)
     (IntegerConstant 1 (Integer 4))
     (Var 3 n)
     ())
    [(Var 3 total_sum)]
    [(Var 3 partial_sum)]
    [(Assignment
        (Var 3 partial_sum)
        (IntegerBinOp
            (Var 3 partial_sum)
            Add
            (ArrayItem
                (Var 3 a)
                [(()
                  (Var 2 i)
                  ())]
                (Integer 4)
                ColMajor
                ()
            )
            (Integer 4)
            ()
        )
        ()
    )]
)
But how do we handle the following?
!$omp parallel private(partial_sum) shared(total_sum)
!$omp do
do i = 1, n
partial_sum = partial_sum + a(i)
end do
!$omp end do
!$omp critical
total_sum = total_sum + partial_sum
!$omp end critical
!$omp end parallel
So, I think we need to add the following nodes to ASR.asdl:
diff --git a/src/libasr/ASR.asdl b/src/libasr/ASR.asdl
index 4e22d89d3..ea801c29a 100644
--- a/src/libasr/ASR.asdl
+++ b/src/libasr/ASR.asdl
@@ -35,7 +35,7 @@ stmt
| Cycle(identifier? stmt_name)
| ExplicitDeallocate(expr* vars)
| ImplicitDeallocate(expr* vars)
- | DoConcurrentLoop(do_loop_head head, expr* shared, expr* local, stmt* body)
+ | DoConcurrentLoop(do_loop_head head, expr* shared, expr* local, omp_stmt* body)
| DoLoop(identifier? name, do_loop_head head, stmt* body, stmt* orelse)
| ErrorStop(expr? code)
| Exit(identifier? stmt_name)
@@ -206,6 +206,11 @@ ttype
| Array(ttype type, dimension* dims, array_physical_type physical_type)
| FunctionType(ttype* arg_types, ttype? return_var_type, abi abi, deftype deftype, string? bindc_name, bool elemental, bool pure, bool module, bool inline, bool static, symbol* restrictions, bool is_restriction)
+omp_stmt
+ = OMPDo(stmt body)
+ | OMPCritical(stmt body)
+ -- ...
@certik What do you think about this?
I would not tie this to OpenMP in ASR, but rather represent it as general parallel constructs.
I would recommend starting with something simpler first (just a single loop). It seems that for the above case we need some way to represent:
!$omp parallel private(partial_sum) shared(total_sum)
!$omp do
do i = 1, n
partial_sum = partial_sum + a(i)
end do
!$omp end do
!$omp critical
total_sum = total_sum + partial_sum
!$omp end critical
!$omp end parallel
in ASR. I don't know the best way right now. Is the loop independent of the "critical part"? Or is it something like this:
parallel block private(partial_sum) shared(total_sum)
do concurrent i = 1, n
partial_sum = partial_sum + a(i)
end do
critical block
total_sum = total_sum + partial_sum
end critical block
end parallel block
In other words, is this something that the Fortran language doesn't even allow (yet) natively, so that if we want to represent it in ASR, we need to create a "parallel block" where the whole block runs in parallel, and then add a "critical block" which acts like a barrier?
For that, I would recommend first creating an explicit example like @Pranavchiku did and seeing how exactly this is implemented at runtime. Then abstract this to ASR in an OpenMP-independent manner.
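For reference, here is a minimal standalone sketch of such an explicit example (my own illustration, not the code from the linked work; the program name, array size and initialization are assumptions). It implements the private partial_sum / shared total_sum pattern discussed above and can be compiled with, e.g., gfortran -fopenmp:

program sum_omp
    implicit none
    integer, parameter :: n = 1000
    integer :: a(n), i, partial_sum, total_sum

    a = 1
    total_sum = 0

    !$omp parallel private(i, partial_sum) shared(total_sum)
    ! each thread accumulates into its own private copy first
    partial_sum = 0
    !$omp do
    do i = 1, n
        partial_sum = partial_sum + a(i)
    end do
    !$omp end do
    ! the critical section serializes the updates to the shared total
    !$omp critical
    total_sum = total_sum + partial_sum
    !$omp end critical
    !$omp end parallel

    print *, "total_sum =", total_sum   ! expected: 1000
end program sum_omp

Running it with several threads (e.g. OMP_NUM_THREADS=4) should still print 1000, which is the behaviour the ASR representation has to preserve.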
I would prefer to use do concurrent as of now and get things working. We can set up AST -> ASR once we have a small example fully working; it won't be tricky, and at that point we'll have a complete idea of how everything works.
The basic do concurrent works: https://github.com/lfortran/lfortran/pull/3972#issuecomment-2094862072, so let's get this PR wrapped up (no ASR modifications) and merged.
Consider an example:
[...]
!$omp do
do i = 2, n
print *, omp_get_thread_num(), i
b(i) = (a(i) + a(i-1)) / 2.0
end do
!$omp end do
[...]
I get:
$ ./a.out
0 2
0 3
0 4
0 5
0 6
0 7
0 8
0 9
0 10
So, it seems to run under the same thread: an !$omp do outside of a !$omp parallel region is executed by just the encountering thread, i.e. it runs serially, so I'm not mapping it to DoConcurrentLoop.
Only the following would be DoConcurrentLoop:
[...]
!$omp parallel private(i)
!$omp do
do i = 2, n
print *, omp_get_thread_num(), i
b(i) = (a(i) + a(i-1)) / 2.0
end do
!$omp end do
!$omp end parallel
[...]
!$omp parallel do private(i)
do i = 2, n
print *, omp_get_thread_num(), i
b(i) = (a(i) + a(i-1)) / 2.0
end do
!$omp end parallel do
Yes. Map everything to DoConcurrent and over time we will tastefully extend our parallel features in ASR, in an OpenMP-independent way. I believe the full do concurrent feature as specified by the Fortran standard is a strict subset of OpenMP, so let's start with this OpenMP subset and map it directly to DoConcurrent (extended to support all Fortran standard features).
Only then let's figure out how to extend ASR even beyond that to support other OpenMP features that currently do not have a Fortran language equivalent. In the ASR->Fortran backend, the user will be able to choose (say, via command line options) how the parallel ASR features should be represented as Fortran code: either do concurrent or OpenMP.
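To make the last point concrete, here is a sketch (my own illustration; the program name, array size and initialization are assumptions) of the two surface forms such a backend could emit for the same parallel loop from the example above:

program backend_forms
    implicit none
    integer, parameter :: n = 10
    real :: a(n), b(n)
    integer :: i
    a = 1.0
    b = 0.0

    ! (a) emitted as standard Fortran do concurrent:
    do concurrent (i = 2:n)
        b(i) = (a(i) + a(i-1)) / 2.0
    end do

    ! (b) emitted as OpenMP, selected e.g. via a command line option:
    !$omp parallel do private(i)
    do i = 2, n
        b(i) = (a(i) + a(i-1)) / 2.0
    end do
    !$omp end parallel do

    print *, b(2:n)
end program backend_forms

Both loops compute the same b, so the choice is purely about which parallel notation the generated Fortran should use.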
Yup, totally agreed!
My bad, I won't enable OpenMP support by default. One has to pass the --openmp option to enable OpenMP; otherwise the pragmas will be treated as comments.
I will make the required changes and report back.
Ready!
Thanks for the review, can you please merge it?
Thank you!