Defect: coarray of derived type: crash during allocation of an allocatable member

Open hassaniriad opened this issue 3 years ago • 2 comments

The title of the issue should start with Defect: followed by a succinct title.

Please make sure to put any logs, terminal output, or code in fenced code blocks. Please also read the contributing guidelines before submitting a new issue.

Please note we will close your issue without comment if you delete, do not read or do not fill out the issue checklist below and provide ALL the requested information.

  • [x] I am reporting a bug others will be able to reproduce and not asking a question or requesting a new feature.

System information including:

  • OpenCoarrays Version: output of 'caf --version':
      OpenCoarrays Coarray Fortran Compiler Wrapper (caf version 2.9.2)
      Copyright (C) 2015-2020 Sourcery Institute
      Copyright (C) 2015-2020 Sourcery, Inc.

  • Fortran Compiler: output of 'gfortran --version':
      GNU Fortran (Homebrew GCC 11.2.0_1) 11.2.0
      Copyright (C) 2021 Free Software Foundation, Inc.

  • C compiler used for building lib: output of 'gcc --version':
      Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/c++/4.2.1
      Apple clang version 11.0.0 (clang-1100.0.33.17)
      Target: x86_64-apple-darwin18.7.0
      Thread model: posix
      InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

  • Installation method: homebrew

  • Output of 'caf -s': /usr/local/bin/gfortran -I/usr/local/Cellar/opencoarrays/2.9.2_1/include/OpenCoarrays-2.9.2_GNU-11.2.0 -fcoarray=lib -Wl,-flat_namespace -Wl,-commons,use_dylibs -L/usr/local/Cellar/mpich/3.4.2/lib ${@} /usr/local/Cellar/opencoarrays/2.9.2_1/lib/libcaf_mpi.a /usr/local/lib/libmpifort.dylib /usr/local/lib/libmpi.dylib /usr/local/lib/libpmpi.dylib

  • Output of 'cafrun -s': /usr/local/bin/mpiexec -n <number_of_images> /path/to/coarray_Fortran_program [arg4 [arg5 [...]]]

  • Output of uname -a: Darwin mycomputer.local 18.7.0 Darwin Kernel Version 18.7.0: Tue Jun 22 19:37:08 PDT 2021; root:xnu-4903.278.70~1/RELEASE_X86_64 x86_64

  • MPI library being used: mpich (OpenMPI unlinked); output of 'mpichversion':
      MPICH Version: 3.4.2
      MPICH Release date: Wed May 26 15:51:40 CDT 2021
      MPICH Device: ch4:ofi
      MPICH configure: --disable-dependency-tracking --enable-fast=all,O3 --enable-g=dbg --enable-romio --enable-shared --with-pm=hydra FC=gfortran-11 F77=gfortran-11 --disable-silent-rules --prefix=/usr/local/Cellar/mpich/3.4.2 --mandir=/usr/local/Cellar/mpich/3.4.2/share/man FFLAGS=-fallow-argument-mismatch CXXFLAGS=-Wno-deprecated CFLAGS=-fgnu89-inline -Wno-deprecated
      MPICH CC: clang -fgnu89-inline -Wno-deprecated -DNDEBUG -DNVALGRIND -g -O3
      MPICH CXX: clang++ -Wno-deprecated -DNDEBUG -DNVALGRIND -g
      MPICH F77: gfortran-11 -fallow-argument-mismatch -g
      MPICH FC: gfortran-11 -g
      MPICH Custom Information:

  • Machine architecture and number of physical cores: Intel Core i7, 4 cores

To help us debug your issue please explain:

Dear OpenCoarrays developers, consider the following minimal example (a module, mymod, and a main program in separate files):

cat mymod.f90

module mymod
   implicit none
   
   type :: my_t
      integer, allocatable :: i(:), j(:)
   end type my_t

contains

   subroutine set ( n, var )
      integer   , intent(in    ) :: n   
      type(my_t), intent(   out) :: var

      integer            :: err
      character(len=100) :: msg
   
      err = 0 ; msg = ''
      
      allocate(var%i(n), stat = err, errmsg = msg)
      if (err /= 0) error stop "in set: allocation failure (for %i): "//trim(msg)

      allocate(var%j(n), stat = err, errmsg = msg)
      if (err /= 0) error stop "in set: allocation failure (for %j): "//trim(msg)
      
      ! ...
      ! ...
      ! ...
   end subroutine set

end module mymod

cat main.f90

program foo
   use mymod
   implicit none
   
   type(my_t) :: myvar[*]       
      
   if (this_image() == 1) then
      ! allocate and set the members of myvar[1]:
      call set ( var = myvar, n = 5 )
   end if
   
   sync all
   ! ...
   ! ...
   ! ...  
   if (this_image() == 1) print*,'terminated'
end program foo

What happened (include command output, screenshots, logs, etc.)

When compiling with caf:

   caf mymod.f90 main.f90 -Wall -fcheck=all -fbacktrace

and running with cafrun:

   cafrun -n 4 ./a.out

an error occurs:

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
#0  0x108a5120e
#1  0x108a5041d
#2  0x7fff6493fb5c

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 58989 RUNNING AT
=   EXIT CODE: 6
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Terminated: 15 (signal 15)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
Error: Command:
   `/usr/local/bin/mpiexec -n 4 ./a.out`
failed to run.

Please also note that:

  1. If I replace intent(out) with intent(inout), an allocation error is reported instead:
ERROR STOP in set: allocation failure (for %j): Attempt to allocate an allocated object
Abort(1) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
Error: Command:
   `/usr/local/bin/mpiexec -n 4 ./a.out`
failed to run.
  2. The issue goes away when I use a single source file:
cat mymod.f90 main.f90 > foo.f90
caf foo.f90 -Wall -fcheck=all -fbacktrace
cafrun -n 4 ./a.out
terminated
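
The "Attempt to allocate an allocated object" message suggests that the components may already appear allocated when set is entered. A minimal diagnostic sketch that could help check this is given below; it is an assumption about where to look, not a confirmed diagnosis. The names diag_mod, report and diag are hypothetical and are not part of the reproducer above. To mimic the failing setup, the module and the program would have to be placed in separate files.

```fortran
! Diagnostic sketch (hypothetical names, not part of the original reproducer):
! prints the allocation status of the components of a derived-type coarray
! before and after they are allocated on image 1.
module diag_mod
   implicit none

   type :: my_t
      integer, allocatable :: i(:), j(:)
   end type my_t

contains

   subroutine report ( label, var )
      character(len=*), intent(in) :: label
      type(my_t)      , intent(in) :: var   ! intent(in): the status is only inspected

      print '(a,": allocated(i) = ",l1,", allocated(j) = ",l1)', &
            label, allocated(var%i), allocated(var%j)
   end subroutine report

end module diag_mod

program diag
   use diag_mod
   implicit none

   type(my_t) :: myvar[*]

   if (this_image() == 1) then
      ! Allocatable components start out unallocated according to the standard,
      ! so any T printed here would point at the coarray/module handling.
      call report ( 'before', myvar )
      allocate(myvar%i(5), myvar%j(5))
      call report ( 'after ', myvar )
   end if

   sync all
end program diag
```

Building this once as a single file and once split at the module boundary, with the same caf and cafrun commands as above, would show whether the reported status differs between the two setups.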

hassaniriad, Nov 05 '21 10:11

@hassaniriad thanks for submitting this. I just discovered what is likely the same issue the day before you submitted this.

@vehre could this be related to your recently merged PR?

rouson, Nov 07 '21 04:11

Hi @rouson, my latest merge was about static arrays. Here I see only allocatable ones. I am more inclined to look for this issue in the module handling, but that is just a guess without having taken a proper look at it. If you want me to analyse and work on this, just say so.

vehre, Nov 08 '21 08:11