Defect: sendget_by_ref() incorrect assignment on LHS when rank == 2

Open nathanweeks opened this issue 6 years ago • 4 comments

  • [x] I am reporting a bug others will be able to reproduce and not asking a question or requesting a new feature.

The sendget_by_ref issue first reported in https://github.com/sourceryinstitute/OpenCoarrays/issues/632#issuecomment-474827958 still applies in OpenCoarrays as of d13375d.

This routine is exercised when a coarray of derived type with an allocatable component appears on both the LHS and the RHS of an assignment statement.

Specifically, there is a problem with the result of the assignment (LHS), at least when the allocatable component on the LHS is an array of rank == 2:

program test_sendget_by_ref
  implicit none
  type :: rank1_type
    integer, allocatable :: A(:)
  end type
  type :: rank2_type
    integer, allocatable :: A(:,:)
  end type
  type(rank1_type) :: R_get[*]
  type(rank2_type) :: R_send[*]
  integer :: i, j

  allocate(R_get%A(this_image()), source=-1)
  R_get%A(this_image()) = this_image()

  allocate(R_send%A(num_images(),num_images()), source=-2)

  sync all

  do i = 1, num_images()
    do j = 1, num_images()
      R_send[i]%A(j,this_image()) = R_get[j]%A(j)
    end do
  end do

  sync all

  write(*,*) this_image(), ':', R_get%A, '|', R_send%A
end program test_sendget_by_ref

Output:

$ caf test_sendget_by_ref.f90
$ cafrun -np 3 ./a.out  | sort -k 1n,1n
           1 :           1 |           1          -2           2          -2           3          -2          -2          -2          -2
           2 :          -1           2 |           1          -2           2          -2           3          -2          -2          -2          -2
           3 :          -1          -1           3 |           1          -2           2          -2           3          -2          -2          -2          -2

The values for R_get%A (before the "|" in preceding output) look correct; however, the values of the R_send%A array (after the "|" in the preceding output) should be for all images:

1          2           3          1           2          3          1          2          3
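For reference, the expected contents can be reproduced without any coarray runtime. The following is only a serial sketch: the program name, the local array A, and the fixed image count n = 3 are assumptions chosen to match the cafrun -np 3 run above. The index k stands in for this_image() on the sending image, and the value j stands in for R_get[j]%A(j), which is j on image j.

```fortran
! Serial reference for what R_send%A should hold on every image after
! the nested loop in the reproducer. No coarrays are used here.
program expected_r_send
  implicit none
  integer, parameter :: n = 3   ! assumed image count (matches cafrun -np 3)
  integer :: A(n,n)
  integer :: j, k

  A = -2                        ! same fill value as the reproducer
  do k = 1, n                   ! k plays the role of this_image() on the sender
    do j = 1, n
      A(j,k) = j                ! R_get[j]%A(j) equals j on image j
    end do
  end do

  write(*,'(*(i0,1x))') A       ! array is written in column-major order
end program expected_r_send
```

Every element ends up overwritten, so no -2 should remain. In the observed output the values 1, 2, 3 land in elements 1, 3, and 5 of the flattened array, with the rest left at -2.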

System information

  • OpenCoarrays Version: 2.6.1-30-gd13375d
  • Fortran Compiler: GFortran 8.3.0
  • C compiler used for building lib: GCC 8.3.0
  • Installation method: FC=gfortran CC=gcc cmake .. -DCMAKE_BUILD_TYPE=Debug
  • All flags & options passed to the installer: N/A
  • MPI library being used: MPICH 3.3
  • Machine architecture and number of physical cores: x86_64, 2 cores (4 threads)
  • Version of CMake: 3.13.4

nathanweeks, Apr 22 '19 11:04

Thanks for the report Nathan!

zbeekman, Apr 24 '19 20:04

It took me some time to understand your example. The allocation allocate(R_get%A(this_image()), source=-1) gives the component a different size on each image (e.g. imperfect partitioning), which does appear to be allowed with non-symmetric coarrays. So there is no problem with that.

do i = 1, num_images()
  do j = 1, num_images()
    ! R_send[i]%A(j,this_image()) = R_get[j]%A(j)
    R_send[i]%A(j,i) = R_get[j]%A(j)
  end do
end do

Here I replaced this_image() with i and got the same output as yours above. So the runtime does not seem to use this_image() for the index, does it?

MichaelSiehl, Apr 26 '19 22:04

After some further testing, it seems that the problem is deeper and not related to the use of this_image(). Maybe it is a failure to map the addressing of the distributed non-symmetric memory? I can only guess. See the paper 'Rationale for Co-Arrays in Fortran 2008' by Aleksandar Donev, chapter 3.2.

MichaelSiehl, Apr 27 '19 09:04

IIRC, the image index is getting passed correctly to sendget_by_ref(), so it seems the issue is somewhere in subsequent code that puts data into the right spot on the remote image specified in the LHS of the assignment.

nathanweeks, Apr 27 '19 23:04
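One possible way to narrow this down is to split the assignment so that the remote read and the remote write become separate statements through a local temporary. This is only a diagnostic sketch: the program name and the tmp variable are introduced here for illustration, and it rests on the assumption that the compiler then uses separate get and send paths instead of sendget_by_ref(). Whether this variant produces the expected output has not been verified.

```fortran
program test_split_get_send
  implicit none
  type :: rank1_type
    integer, allocatable :: A(:)
  end type
  type :: rank2_type
    integer, allocatable :: A(:,:)
  end type
  type(rank1_type) :: R_get[*]
  type(rank2_type) :: R_send[*]
  integer :: i, j
  integer :: tmp   ! hypothetical local temporary, not in the original reproducer

  allocate(R_get%A(this_image()), source=-1)
  R_get%A(this_image()) = this_image()

  allocate(R_send%A(num_images(),num_images()), source=-2)

  sync all

  do i = 1, num_images()
    do j = 1, num_images()
      tmp = R_get[j]%A(j)                 ! remote read into a local scalar
      R_send[i]%A(j,this_image()) = tmp   ! remote write from a local scalar
    end do
  end do

  sync all

  write(*,*) this_image(), ':', R_get%A, '|', R_send%A
end program test_split_get_send
```

It can be built and run the same way as the original (caf, then cafrun -np 3). If R_send%A comes out correct here but not in the original program, that would point at the combined sendget path and not at the individual get or send routines.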