[ASR] Handle OpenMP Pragma in AST->ASR
Now, the ASR generated for the example below is:
Example:
!$omp parallel private(partial_sum) shared(total_sum)
!$omp do
do i = 1, n
partial_sum = partial_sum + a(i)
end do
!$omp end do
!$omp end parallel
ASR:
(DoConcurrentLoop
    ((Var 2 i)
     (IntegerConstant 1 (Integer 4))
     (Var 3 n)
     ())
    [(Var 3 total_sum)]
    [(Var 3 partial_sum)]
    [(Assignment
        (Var 3 partial_sum)
        (IntegerBinOp
            (Var 3 partial_sum)
            Add
            (ArrayItem
                (Var 3 a)
                [(()
                  (Var 2 i)
                  ())]
                (Integer 4)
                ColMajor
                ()
            )
            (Integer 4)
            ()
        )
        ()
    )]
)
But how do we handle the following?
!$omp parallel private(partial_sum) shared(total_sum)
!$omp do
do i = 1, n
partial_sum = partial_sum + a(i)
end do
!$omp end do
!$omp critical
total_sum = total_sum + partial_sum
!$omp end critical
!$omp end parallel
So, I think we need to add the following nodes to ASR.asdl:
diff --git a/src/libasr/ASR.asdl b/src/libasr/ASR.asdl
index 4e22d89d3..ea801c29a 100644
--- a/src/libasr/ASR.asdl
+++ b/src/libasr/ASR.asdl
@@ -35,7 +35,7 @@ stmt
| Cycle(identifier? stmt_name)
| ExplicitDeallocate(expr* vars)
| ImplicitDeallocate(expr* vars)
- | DoConcurrentLoop(do_loop_head head, expr* shared, expr* local, stmt* body)
+ | DoConcurrentLoop(do_loop_head head, expr* shared, expr* local, omp_stmt* body)
| DoLoop(identifier? name, do_loop_head head, stmt* body, stmt* orelse)
| ErrorStop(expr? code)
| Exit(identifier? stmt_name)
@@ -206,6 +206,11 @@ ttype
| Array(ttype type, dimension* dims, array_physical_type physical_type)
| FunctionType(ttype* arg_types, ttype? return_var_type, abi abi, deftype deftype, string? bindc_name, bool elemental, bool pure, bool module, bool inline, bool static, symbol* restrictions, bool is_restriction)
+omp_stmt
+ = OMPDo(stmt body)
+ | OMPCritical(stmt body)
+ -- ...
@certik What do you think about this?
I would not tie this to OpenMP in ASR, but rather represent it as general parallel constructs.
I would recommend starting with something simpler first (just a single loop). It seems that for the above case we need some way to represent:
!$omp parallel private(partial_sum) shared(total_sum)
!$omp do
do i = 1, n
partial_sum = partial_sum + a(i)
end do
!$omp end do
!$omp critical
total_sum = total_sum + partial_sum
!$omp end critical
!$omp end parallel
in ASR. I don't know the best way right now. Is the loop independent of the "critical part"? Or is it something like this:
parallel block private(partial_sum) shared(total_sum)
do concurrent i = 1, n
partial_sum = partial_sum + a(i)
end do
critical block
total_sum = total_sum + partial_sum
end critical block
end parallel block
In other words, is this something that the Fortran language doesn't even allow (yet) natively, so that if we want to represent it in ASR, we need to create a "parallel block" where the whole block runs in parallel, and then add a "critical block" which acts like a barrier?
For that, I would recommend first creating an explicit example like @Pranavchiku did and seeing how exactly this is implemented at runtime. Then abstract this to ASR in an OpenMP-independent manner.
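For reference, here is a minimal standalone sketch of such an explicit example (my own illustration, not the code from the linked work; the program name, array size and initialization are assumptions). It implements the private partial_sum / shared total_sum pattern discussed above and can be compiled with, e.g., gfortran -fopenmp:

program sum_omp
    implicit none
    integer, parameter :: n = 1000
    integer :: a(n), i, partial_sum, total_sum

    a = 1
    total_sum = 0

    !$omp parallel private(i, partial_sum) shared(total_sum)
    ! each thread accumulates into its own private copy first
    partial_sum = 0
    !$omp do
    do i = 1, n
        partial_sum = partial_sum + a(i)
    end do
    !$omp end do
    ! the critical section serializes the updates to the shared total
    !$omp critical
    total_sum = total_sum + partial_sum
    !$omp end critical
    !$omp end parallel

    print *, "total_sum =", total_sum   ! expected: 1000
end program sum_omp

Running it with several threads (e.g. OMP_NUM_THREADS=4) should still print 1000, which is the behaviour the ASR representation has to preserve.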
I would prefer to use do concurrent as of now and get things working. We can set up AST -> ASR once we have a small example fully working; it won't be tricky, and at that point we'll have a complete idea of how everything works.
The basic do concurrent works: https://github.com/lfortran/lfortran/pull/3972#issuecomment-2094862072, so let's get this PR wrapped up (no ASR modifications) and merged.
Consider an example:
[...]
!$omp do
do i = 2, n
print *, omp_get_thread_num(), i
b(i) = (a(i) + a(i-1)) / 2.0
end do
!$omp end do
[...]
I get:
$ ./a.out
0 2
0 3
0 4
0 5
0 6
0 7
0 8
0 9
0 10
So, it seems to run under the same thread: an !$omp do outside of a !$omp parallel region is executed by just the encountering thread, i.e. it runs serially, so I'm not mapping it to DoConcurrentLoop.
Only the following would be DoConcurrentLoop:
[...]
!$omp parallel private(i)
!$omp do
do i = 2, n
print *, omp_get_thread_num(), i
b(i) = (a(i) + a(i-1)) / 2.0
end do
!$omp end do
!$omp end parallel
[...]
!$omp parallel do private(i)
do i = 2, n
print *, omp_get_thread_num(), i
b(i) = (a(i) + a(i-1)) / 2.0
end do
!$omp end parallel do
Yes. Map everything to DoConcurrent and over time we will tastefully extend our parallel features in ASR, in an OpenMP-independent way. I believe the full do concurrent feature as specified by the Fortran standard is a strict subset of OpenMP, so let's start with this OpenMP subset and map it directly to DoConcurrent (extended to support all Fortran standard features).
Only then let's figure out how to extend ASR even beyond that to support other OpenMP features that currently do not have a Fortran language equivalent. In the ASR->Fortran backend, the user will be able to choose (say, via command line options) how the parallel ASR features should be represented as Fortran code: either do concurrent or OpenMP.
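To make the last point concrete, here is a sketch (my own illustration; the program name, array size and initialization are assumptions) of the two surface forms such a backend could emit for the same parallel loop from the example above:

program backend_forms
    implicit none
    integer, parameter :: n = 10
    real :: a(n), b(n)
    integer :: i
    a = 1.0
    b = 0.0

    ! (a) emitted as standard Fortran do concurrent:
    do concurrent (i = 2:n)
        b(i) = (a(i) + a(i-1)) / 2.0
    end do

    ! (b) emitted as OpenMP, selected e.g. via a command line option:
    !$omp parallel do private(i)
    do i = 2, n
        b(i) = (a(i) + a(i-1)) / 2.0
    end do
    !$omp end parallel do

    print *, b(2:n)
end program backend_forms

Both loops compute the same b, so the choice is purely about which parallel notation the generated Fortran should use.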
Yup, totally agreed!
My bad, I won't enable OpenMP support by default. One has to pass the --openmp option to enable OpenMP; otherwise the pragmas will be treated as comments.
I will make the required changes and report back.
Ready!
Thanks for the review, can you please merge it?
Thank you!