dbcsr icon indicating copy to clipboard operation
dbcsr copied to clipboard

Incorrrectly Master-workshare nesting

Open dmishura opened this issue 1 year ago • 3 comments

Found in code evaluation with CP2K. Code hung with latest ifx compiler. Caused by nested OpenMP regions that doesn't allow by standard. Code path contain multiple source files, so only short snippets here.

  1. Parallel section creation dbcsr_finalize in dbscr_work_operations.F
!$OMP           PARALLEL DEFAULT (NONE) &
!$OMP                    SHARED (matrix, old_row_p, old_col_i, old_blk_p,&
!$OMP                            start_offset, sort_data)
            CALL dbcsr_merge_all(matrix, &
                                 old_row_p, old_col_i, old_blk_p, &
                                 sort_data=sort_data)
!$OMP           END PARALLEL
  1. Create Master section dbcsr_merge_all in dbscr_work_operations.F
!$OMP     MASTER
      CALL dbcsr_clearfrom_index_array(matrix, dbcsr_slot_row_p)
      CALL dbcsr_clearfrom_index_array(matrix, dbcsr_slot_col_i)
      CALL dbcsr_clearfrom_index_array(matrix, dbcsr_slot_blk_p)
      CALL dbcsr_addto_index_array(matrix, dbcsr_slot_row_p, &
                                   DATA=new_row_p(1:nrows + 1), extra=new_nblks*2)
      CALL dbcsr_addto_index_array(matrix, dbcsr_slot_col_i, &
                                   DATA=new_col_i(1:new_nblks))
      IF (sort_data) THEN
         CALL dbcsr_addto_index_array(matrix, dbcsr_slot_blk_p, &
                                      DATA=new_blk_p_sorted)
      ELSE
         CALL dbcsr_addto_index_array(matrix, dbcsr_slot_blk_p, &
                                      DATA=new_blk_p)
      END IF
      matrix%nblks = new_nblks
      matrix%nze = new_nze
      matrix%index(dbcsr_slot_nblks) = matrix%nblks
      matrix%index(dbcsr_slot_nze) = matrix%nze
      CALL dbcsr_repoint_index(matrix)
!$OMP     END MASTER

  1. Transit dbcsr_addto_index_array in dbcsr_index_operations.F
         CALL ensure_array_size(matrix%index, lb=1, ub=ub_new, &
                                factor=default_resize_factor, &
                                nocopy=.FALSE., &
                                memory_type=matrix%index_memory_type)

  1. Transit ensure_array_size in dbcsr_ptr_util.F
            IF (ub_orig - lb_orig + 1 .gt. 0) THEN
               !newarray(lb_orig:ub_orig) = array(lb_orig:ub_orig)
               CALL mem_copy_${nametype1}$ (newarray(lb_orig:ub_orig), &
                                            array(lb_orig:ub_orig), ub_orig - lb_orig + 1)
            END IF

  1. Create workshare section mem_copy_${nametype1}$ in dbcsr_ptr_util.F
      SUBROUTINE mem_copy_${nametype1}$ (dst, src, n)
     !! Copies memory area

         INTEGER, INTENT(IN) :: n
        !! length of copy
         ${type1}$, DIMENSION(1:n), INTENT(OUT) :: dst
        !! destination memory
         ${type1}$, DIMENSION(1:n), INTENT(IN) :: src
        !! source memory
#if !defined(__DBCSR_DISABLE_WORKSHARE)
!$OMP     PARALLEL WORKSHARE DEFAULT(none) SHARED(dst,src)
#endif
         dst(:) = src(:)
#if !defined(__DBCSR_DISABLE_WORKSHARE)
!$OMP     END PARALLEL WORKSHARE
#endif
      END SUBROUTINE mem_copy_${nametype1}$

Situation when __DBCSR_DISABLE_WORKSHARE macros is not defined violated OpenMP standard that says:

" DO, SECTIONS, SINGLE, and WORKSHARE directives are not permitted in the dynamic extent of CRITICAL, ORDERED, and MASTER directives"

Also please note that MASTER construct is deprecated in latest spec.

Fortran pretty old spec: https://www.openmp.org/wp-content/uploads/fspec20.pdf

dmishura avatar Oct 14 '24 09:10 dmishura

@dmishura Awesome - thank you, Dmitry!

hfp avatar Oct 15 '24 08:10 hfp

I explored an approach to use CP2K's convention checker and essentially ban nested parallelism or more specifically a parallel region inside of a master section. However, the GFortran AST cannot be combined with a call graph query unless much additional work is invested. Alternative approach is to rely on an assertion.

hfp avatar Oct 16 '24 08:10 hfp

Also please note that MASTER construct is deprecated in latest spec.

This questions the whole MPI-like programming model in OpenMP based on get_get_thread_num. This is about "OpenMP tasks everywhere" to enable composable parallelism. Nice dream.

( That's too much to ask for ;-)

hfp avatar Oct 16 '24 11:10 hfp

The current thesis is, nested parallelism like PARALLEL WORKSHARE even inside inside of a MASTER section (which in turn is inside of another PARALLEL region) should never hang. GNU Fortran as well as IFORT seems to confirm this at runtime. However, GNU, IFORT, and IFX produce an error message at compile-time if the nest is not "firewalled" by a subroutine but literally nested (second code-path when writing #if 0):

      PROGRAM NESTED
        IMPLICIT NONE
        INTEGER :: dst(24), src(24), i

        DO i = 1, SIZE(src)
          src(i) = i - 1
        END DO

!$OMP   PARALLEL
!$OMP   MASTER
#if 1
        CALL workshare(dst, src)
#else
!$OMP   PARALLEL WORKSHARE
        dst(:) = src(:)
!$OMP   END PARALLEL WORKSHARE
#endif
!$OMP   END MASTER
!$OMP   END PARALLEL

        DO i = 1, SIZE(dst)
          WRITE(*,*) dst(i)
        END DO

      CONTAINS
        SUBROUTINE workshare(dst, src)
          INTEGER, INTENT(OUT) :: dst(:)
          INTEGER, INTENT(IN)  :: src(:)
!$OMP     PARALLEL WORKSHARE
          dst(:) = src(:)
!$OMP     END PARALLEL WORKSHARE
        END SUBROUTINE
      END PROGRAM

The confusion in compiler's implementation seems to stem from PARALLEL WORKSHARE vs. just WORKSHARE, with the latter indeed incorrect inside of a master/single section.

hfp avatar Dec 02 '24 10:12 hfp

Nice investigation @hfp ! I would say we can remove all WORKSHARE section, we have memory-boud anyway... I will work on DBCSR starting this week...

alazzaro avatar Dec 02 '24 10:12 alazzaro

Also please note that MASTER construct is deprecated in latest spec.

Fortran pretty old spec: https://www.openmp.org/wp-content/uploads/fspec20.pdf

Thanks for pointing this out. I would not change MASTER to the new MASKED construct, at least not with the next release... I will open a specific issue as a reminder.

alazzaro avatar Dec 09 '24 17:12 alazzaro