
Segmentation fault when using the CUDA backend

Open • wangyf opened this issue 2 years ago • 1 comment

Hi, I am trying to learn and use your MPI+Kokkos code. It runs fine with the MPI+OpenMP backend in Kokkos but fails with MPI+CUDA. The output I got is below. Do you have any idea what is going wrong with the CUDA backend? Thanks.

I'm MPI task #3 (out of 4) pinned to GPU #0 (out of 4) pinned to GPU #0
We are about to start simulation with the following characteristics
Global resolution : 256 x 256 x 1
Local resolution : 128 x 128 x 1
MPI Cartesian topology : 2x2x1
[c196-012:10061:0:10061] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x2ab54d4c3e80)
[c196-012:10062:0:10062] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x2ac7db4c3e80)
[c196-012:10063:0:10063] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x2b64db4c3e80)
[c196-012:10064:0:10064] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x2b131d4c3e80)
==== backtrace (tid: 10064) ====
0 0x000000000004cb95 ucs_debug_print_backtrace() ???:0
1 0x000000000089f648 I_MPI_memcpy_movsb() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/i_mpi_memcpy_sse.h:11
2 0x000000000089f648 bdw_memcpy_write() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/intel_transport_memcpy.h:146
3 0x000000000089bce9 write_to_cell() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/intel_transport_memcpy.h:326
4 0x000000000089bce9 send_cell() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/intel_transport_send.h:890
5 0x00000000008959a4 MPIDI_POSIX_eager_send() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/intel_transport_send.h:1540
6 0x0000000000755399 MPIDI_POSIX_eager_send() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/posix_eager_impl.h:37
7 0x0000000000755399 MPIDI_POSIX_am_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/src/../src/../posix/posix_am.h:220
8 0x0000000000755399 MPIDI_SHM_am_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/src/../src/shm_am.h:49
9 0x0000000000755399 MPIDIG_isend_impl() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/generic/mpidig_send.h:116
10 0x000000000075870e MPIDIG_am_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/generic/mpidig_send.h:172
11 0x000000000075870e MPIDIG_mpi_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/generic/mpidig_send.h:233
12 0x000000000075870e MPIDI_POSIX_mpi_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/src/../src/../posix/posix_send.h:59
13 0x000000000075870e MPIDI_SHM_mpi_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/src/../src/shm_p2p.h:187
14 0x000000000075870e MPIDI_isend_unsafe() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/src/ch4_send.h:314
15 0x000000000075870e MPIDI_isend_safe() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/src/ch4_send.h:609
16 0x000000000075870e MPID_Isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/src/ch4_send.h:828
17 0x000000000075870e PMPI_Sendrecv() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpi/pt2pt/sendrecv.c:181
18 0x00000000004c453a hydroSimu::MpiComm::sendrecv() ???:0
19 0x0000000000491e9b euler_kokkos::SolverBase::transfert_boundaries_2d() ???:0
20 0x00000000004a1877 euler_kokkos::SolverBase::make_boundaries_mpi() ???:0
21 0x000000000044d786 euler_kokkos::muscl::SolverHydroMuscl<2>::make_boundaries() ???:0
22 0x0000000000445223 euler_kokkos::muscl::SolverHydroMuscl<2>::SolverHydroMuscl() ???:0
23 0x0000000000445d05 euler_kokkos::muscl::SolverHydroMuscl<2>::create() ???:0
24 0x00000000004152e8 euler_kokkos::SolverFactory::create() ???:0
25 0x00000000004116e6 main() ???:0
26 0x0000000000022555 __libc_start_main() ???:0
27 0x0000000000414fec _start() ???:0

=================================
==== backtrace (tid: 10063) ====
0 0x000000000004cb95 ucs_debug_print_backtrace() ???:0
1 0x000000000089f648 I_MPI_memcpy_movsb() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/i_mpi_memcpy_sse.h:11
2 0x000000000089f648 bdw_memcpy_write() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/intel_transport_memcpy.h:146
3 0x000000000089bce9 write_to_cell() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/intel_transport_memcpy.h:326
4 0x000000000089bce9 send_cell() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/intel_transport_send.h:890
5 0x00000000008959a4 MPIDI_POSIX_eager_send() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/intel_transport_send.h:1540
6 0x0000000000755399 MPIDI_POSIX_eager_send() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/posix/eager/include/posix_eager_impl.h:37
7 0x0000000000755399 MPIDI_POSIX_am_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/src/../src/../posix/posix_am.h:220
8 0x0000000000755399 MPIDI_SHM_am_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/src/../src/shm_am.h:49
9 0x0000000000755399 MPIDIG_isend_impl() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/generic/mpidig_send.h:116
10 0x000000000075870e MPIDIG_am_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/generic/mpidig_send.h:172
11 0x000000000075870e MPIDIG_mpi_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/generic/mpidig_send.h:233
12 0x000000000075870e MPIDI_POSIX_mpi_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/src/../src/../posix/posix_send.h:59
13 0x000000000075870e MPIDI_SHM_mpi_isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/shm/src/../src/shm_p2p.h:187
14 0x000000000075870e MPIDI_isend_unsafe() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/src/ch4_send.h:314
15 0x000000000075870e MPIDI_isend_safe() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/src/ch4_send.h:609
16 0x000000000075870e MPID_Isend() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpid/ch4/src/ch4_send.h:828
17 0x000000000075870e PMPI_Sendrecv() /localdisk/jenkins/workspace/workspace/ch4-build-linux-2019/impi-ch4-build-linux_build/CONF/impi-ch4-build-linux-release/label/impi-ch4-build-linux-intel64/_buildspace/release/../../src/mpi/pt2pt/sendrecv.c:181
18 0x00000000004c453a hydroSimu::MpiComm::sendrecv() ???:0
19 0x0000000000491e9b euler_kokkos::SolverBase::transfert_boundaries_2d() ???:0
20 0x00000000004a1877 euler_kokkos::SolverBase::make_boundaries_mpi() ???:0
21 0x000000000044d786 euler_kokkos::muscl::SolverHydroMuscl<2>::make_boundaries() ???:0
22 0x0000000000445223 euler_kokkos::muscl::SolverHydroMuscl<2>::SolverHydroMuscl() ???:0
23 0x0000000000445d05 euler_kokkos::muscl::SolverHydroMuscl<2>::create() ???:0
24 0x00000000004152e8 euler_kokkos::SolverFactory::create() ???:0
25 0x00000000004116e6 main() ???:0
26 0x0000000000022555 __libc_start_main() ???:0
27 0x0000000000414fec _start() ???:0

wangyf, Apr 28 '22 03:04