OpenCoarrays
OpenCoarrays copied to clipboard
Defect: Spurious coarray bounds error with -fcheck=bounds
- [x] I am reporting a bug others will be able to reproduce and not asking a question or requesting a new feature.
System information including:
-
OpenCoarrays Version: caf version 2.9.2-13-g235167d
-
Fortran Compiler: GNU Fortran (GCC) 11.2.0
-
C compiler used for building lib: gcc (GCC) 11.2.0
-
Installation method:
make; make install -
All flags & options passed to the installer:
cmake -DCMAKE_BUILD_TYPE=Release -
Output of
uname -a:Linux thelio.indiana 5.14.18-100.fc33.x86_64 #1 SMP Fri Nov 12 17:38:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux -
MPI library being used: MPICH 3.3.2
-
Machine architecture and number of physical cores: 12 core AMD Threadripper 2920X
-
Version of CMake: 3.19.7
To help us debug your issue please explain:
What you were trying to do (and why)
I am developing code to perform a halo exchange associated with domain decomposition methods for PDEs and have encountered a problem. To debug the problem I've extracted the problem loop into a subroutine and wrapped it with a main program that initializes some structure arrays with data from a debugging data set for 4 images. The subroutine loop is performing the equivalent of an MPI_Alltoallv or MPI_Neighbor_alltoall.After calling the subroutine, the program compares the result of the subroutine with reference data to check for a successful completion. Here is the problem reproducer:
program main
type :: index_map
integer, allocatable :: offP_index(:)
integer, allocatable :: recv_displs(:), recv_counts(:)
integer, allocatable :: send_displs(:), send_image(:), send_index(:)
integer, allocatable :: send_index_ref(:)
end type
type(index_map) :: imap
select case (this_image())
case (1)
imap%send_image = [2, 3, 4]
imap%recv_counts = [3, 5, 4]
imap%recv_displs = [0, 3, 8]
imap%send_displs = [0, 0, 0]
imap%offP_index = [8, 10, 12, 13, 14, 16, 18, 19, 21, 23, 25, 27]
imap%send_index_ref = [1, 2, 3, 4, 5, 6, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
case (2)
imap%send_image = [1, 3, 4]
imap%recv_counts = [6, 2, 4]
imap%recv_displs = [0, 6, 8]
imap%send_displs = [0, 5, 4]
imap%offP_index = [1, 2, 3, 4, 5, 6, 15, 16, 20, 21, 24, 25]
imap%send_index_ref = [8, 10, 12, 9, 10, 11, 12, 7, 8, 9, 10, 11, 12]
case (3)
imap%send_image = [1, 2, 4]
imap%recv_counts = [5, 4, 6]
imap%recv_displs = [0, 5, 9]
imap%send_displs = [6, 3, 8]
imap%offP_index = [2, 3, 4, 5, 6, 9, 10, 11, 12, 21, 23, 24, 25, 26, 27]
imap%send_index_ref = [13, 14, 16, 18, 19, 15, 16, 13, 14, 15, 16, 17, 18, 19]
case (4)
imap%send_image = [1, 2, 3]
imap%recv_counts = [6, 6, 7]
imap%recv_displs = [0, 6, 12]
imap%send_displs = [11, 7, 7]
imap%offP_index = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
imap%send_index_ref = [21, 23, 25, 27, 20, 21, 24, 25, 21, 23, 24, 25, 26, 27]
end select
allocate(imap%send_index, mold=imap%send_index_ref)
imap%send_index = 0
call sub(imap)
if (any(imap%send_index /= imap%send_index_ref)) error stop
contains
subroutine sub(this)
type(index_map), intent(inout), target :: this
integer :: j, k
type :: box
integer, pointer :: array(:)
end type
type(box), allocatable :: buffer[:]
allocate(buffer[*])
buffer%array => this%send_index
sync all
do j = 1, size(this%recv_counts)
associate (i => this%send_image(j), n => this%recv_counts(j), &
s0 => this%send_displs(j), r0 => this%recv_displs(j))
do k = 1, n
!write(*,'(i0,a,5(1x,i0))') this_image(), ': ', j, i, &
! lbound(buffer[i]%array), s0+k, ubound(buffer[i]%array)
buffer[i]%array(s0+k) = this%offP_index(r0+k)
end do
end associate
end do
sync all
end subroutine
end program
What happened (include command output, screenshots, logs, etc.)
When compiled with -fcheck=bounds the runtime reports a spurious out-of-bounds error:
At line 69 of file gfortran-20220127.f90
Fortran runtime error: Index '15' of dimension 1 of array 'buffer%array' above upper bound of 14
Error termination. Backtrace:
#0 0x402b81 in ???
#1 0x4050b5 in ???
#2 0x40545e in ???
#3 0x7f325e2981e1 in ???
#4 0x40253d in ???
#5 0xffffffffffffffff in ???
Error: Command:
`/opt/lib/gcc-11_2/mpich/3.3.2/bin/mpiexec -n 4 ./a.out`
failed to run.
What you expected to happen
The program should run to completion without error. That there is no actual out-of-bounds reference can be confirmed by uncommenting the write statement and verifying the index in each iteration of the loop lies within the array bounds -- until it aborts on the following line. Also the NAG and Intel compilers report no out-of-bounds access.
Step-by-step reproduction instructions to reproduce the error/bug
caf -fcheck=bounds reproducer.f90
cafrun -n 4 ./a.out