Errors using OpenMP on Sedov test
Hello,
I'm getting this error when trying to run the Sedov 3D MHD problem with OpenMP:
amrex::Error::0::Couldn't open file: sedov_3d_plt00000.temp/Level_0/Cell_D_00000 !!!
SIGABRT
amrex::Error::0::Couldn't open file: sedov_3d_plt00000.temp/Level_0/Cell_D_00000 !!!
SIGABRT
amrex::Error::0::Couldn't open file: sedov_3d_plt00000.temp/Level_0/Cell_D_00000 !!!
SIGABRT
See Backtrace.0.0 file for details
See Backtrace.0.0 file for details
See Backtrace.0.0 file for details
amrex::Error::0::Couldn't open file: sedov_3d_plt00050.temp/Level_0/Cell_D_00000 !!!
SIGABRT
amrex::Error::0::Couldn't open file: sedov_3d_plt00050.temp/Level_0/Cell_D_00000 !!!
SIGABRT
See Backtrace.0.0 file for details
See Backtrace.0.0 file for details
amrex::Error::0::Couldn't open file: sedov_3d_plt00062.temp/Level_0/Cell_D_00000 !!!
SIGABRT
amrex::Error::0::Couldn't open file: sedov_3d_plt00062.temp/Level_0/Cell_D_00000 !!!
SIGABRT
amrex::Error::0::Couldn't open file: sedov_3d_plt00062.temp/Level_0/Cell_D_00000 !!!
SIGABRT
See Backtrace.0.0 file for details
See Backtrace.0.0 file for details
See Backtrace.0.0 file for details
The error appears when trying to use OpenMP. In Backtrace.0.0 I see this:
=== If no file names and line numbers are shown below, one can run
addr2line -Cpfie my_exefile my_line_address
to convert `my_line_address` (e.g., 0x4a6b) into file name and line number.
Or one can use amrex/Tools/Backtrace/parse_bt.py.
=== Please note that the line number reported by addr2line may not be accurate.
One can use
readelf -wl my_exefile | grep my_line_address'
to find out the offset for that line.
0: ./Castro3d.gnu.OMP.ex(+0x1e2d65) [0x55e884552d65]
amrex::BLBackTrace::print_backtrace_info(_IO_FILE*) at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../external/amrex/Src/Base/AMReX_BLBackTrace.cpp:179
1: ./Castro3d.gnu.OMP.ex(+0x1e4b08) [0x55e884554b08]
amrex::BLBackTrace::handler(int) at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../external/amrex/Src/Base/AMReX_BLBackTrace.cpp:85
2: ./Castro3d.gnu.OMP.ex(+0xe22a7) [0x55e8844522a7]
amrex::Error_host(char const*) at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../external/amrex/Src/Base/AMReX.cpp:221
3: ./Castro3d.gnu.OMP.ex(+0x1119e3) [0x55e8844819e3]
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_is_local() const at /usr/include/c++/9/bits/basic_string.h:222
(inlined by) std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose() at /usr/include/c++/9/bits/basic_string.h:231
(inlined by) std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() at /usr/include/c++/9/bits/basic_string.h:658
(inlined by) amrex::FileOpenFailed(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../external/amrex/Src/Base/AMReX_Utility.cpp:167
4: ./Castro3d.gnu.OMP.ex(+0x14e80d) [0x55e8844be80d]
amrex::NFilesIter::ReadyToWrite(bool) at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../external/amrex/Src/Base/AMReX_NFiles.cpp:321
5: ./Castro3d.gnu.OMP.ex(+0x1418bf) [0x55e8844b18bf]
amrex::VisMF::Write(amrex::FabArray<amrex::FArrayBox> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, amrex::VisMF::How, bool) at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../external/amrex/Src/Base/AMReX_VisMF.cpp:1007
6: ./Castro3d.gnu.OMP.ex(+0x5a8fe) [0x55e8843ca8fe]
Castro::plotFileOutput(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::ostream&, amrex::VisMF::How, int) at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../Source/driver/Castro_io.cpp:1156
7: ./Castro3d.gnu.OMP.ex(+0x2523ac) [0x55e8845c23ac]
amrex::Amr::writePlotFileDoit(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../external/amrex/Src/Amr/AMReX_Amr.cpp:995 (discriminator 2)
8: ./Castro3d.gnu.OMP.ex(+0x252d82) [0x55e8845c2d82]
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_is_local() const at /usr/include/c++/9/bits/basic_string.h:222
(inlined by) std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose() at /usr/include/c++/9/bits/basic_string.h:231
(inlined by) std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() at /usr/include/c++/9/bits/basic_string.h:658
(inlined by) amrex::Amr::writePlotFile() at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../external/amrex/Src/Amr/AMReX_Amr.cpp:880
9: ./Castro3d.gnu.OMP.ex(+0x23256) [0x55e884393256]
main at /shared/castro-21.11/Castro/Exec/hydro_tests/Sedov/../../../Source/driver/main.cpp:160
10: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f1bd3bca0b3]
11: ./Castro3d.gnu.OMP.ex(+0x2ae6e) [0x55e88439ae6e]
?? ??:0
The only thing I added to inputs.3d.mhd is
castro.max_subcycles = 16
My GNUmakefile is
PRECISION = DOUBLE
PROFILE = FALSE
DEBUG = FALSE
DIM = 3
COMP = gnu
USE_MPI = FALSE
USE_OMP = TRUE
USE_MHD = TRUE
USE_FORT_MICROPHYSICS := FALSE
BL_NO_FORT := TRUE
# define the location of the CASTRO top directory
CASTRO_HOME := ../../..
# This sets the EOS directory in $(MICROPHYSICS_HOME)/EOS
EOS_DIR := gamma_law
# This sets the network directory in $(MICROPHYSICS_HOME)/Networks
NETWORK_DIR := general_null
NETWORK_INPUTS = gammalaw.net
Bpack := ./Make.package
Blocs := .
include $(CASTRO_HOME)/Exec/Make.Castro
I can't seem to reproduce this. There might be a race condition somewhere or a filesystem issue, but when I run with 16 OpenMP threads, I can output the plotfile without issue.
Can you tell me what machine you ran on, and how many MPI tasks and OpenMP threads you used?
Oh, I just noticed you are running without MPI, just OpenMP. I tried that, and I also have no problem outputting.
Ah, the problem was using mpirun instead of just running the executable. After fixing that the simulation works, but I'm not seeing any improvement in wall time for 1, 2, 4, 8, and 12 OpenMP threads. It's not a big deal for me right now; I'm just trying to assess how Castro scales for an XSEDE allocation. MPI scales well.
It looks like we never tiled the MHD algorithm, since we were mostly interested in running it on GPUs. I'll look at adding the tiling tonight.
Is that easy to do? Perhaps I could do it.
It may be as simple as adding TilingIfNotGPU() to the single MFIter loop in Castro_mhd.cpp, but I am not 100% certain.
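For reference, the usual AMReX tiling idiom looks roughly like the sketch below. This is just the general pattern, not the actual Castro_mhd.cpp code; the function and MultiFab names are placeholders:

#include <AMReX.H>
#include <AMReX_MultiFab.H>
#include <AMReX_Gpu.H>

// Sketch of the standard AMReX OpenMP tiling pattern.  The function and
// MultiFab names here are placeholders, not the real Castro variables.
void do_mhd_work (amrex::MultiFab& state)
{
#ifdef AMREX_USE_OMP
#pragma omp parallel if (amrex::Gpu::notInLaunchRegion())
#endif
    for (amrex::MFIter mfi(state, amrex::TilingIfNotGPU()); mfi.isValid(); ++mfi)
    {
        // Each OpenMP thread gets its own subset of logical tiles, so the
        // kernel should work on the tile box rather than the full valid box.
        const amrex::Box& bx = mfi.tilebox();

        // ... the existing MHD kernel work over bx would go here ...
        (void) bx; // placeholder so this sketch compiles cleanly
    }
}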
OK. I'll set up a fork and give it a shot.
That seems to have helped quite a bit, although on my system the scaling does seem to stall out beyond 8 cores.
castro-sedovdev-1x1.omp.out:3727:Run time without initialization = 199.2570635
castro-sedovdev-1x2.omp.out:3727:Run time without initialization = 110.7894975
castro-sedovdev-1x4.omp.out:3727:Run time without initialization = 61.3167444
castro-sedovdev-1x8.omp.out:3727:Run time without initialization = 35.07138379
castro-sedovdev-1x12.omp.out:3727:Run time without initialization = 33.61117978
Should I create a pull request related to this issue, or should I make a new issue?
The way tiling works is that it divides a box up into logical tiles and then distributes those tiles across the OMP threads. If you are running the default inputs.3d.mhd, then you have a single 32^3 box. The default tile size is 1024x8x8, so that gives you 16 tiles for OMP (1 tile in x, 32/8 = 4 in each of y and z), which don't spread nicely over 12 cores. I'm not sure whether your chip has 16 cores, but OMP might work better there.
You could try setting fabarray.mfiter_tile_size = 1024 4 4, which would give you 64 tiles, but at some point the tiles might simply be too small to scale effectively.
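If you want to check those counts directly, here is a small standalone sketch (a throwaway test program, not part of Castro) that builds a single 32^3 box and prints how many tiles MFIter generates for each tile size:

#include <AMReX.H>
#include <AMReX_MultiFab.H>
#include <AMReX_Print.H>

// Throwaway test: count the logical tiles MFIter generates for a single
// 32^3 box at a given tile size.
static int count_tiles (const amrex::MultiFab& mf, const amrex::IntVect& tile_size)
{
    int n = 0;
    for (amrex::MFIter mfi(mf, tile_size); mfi.isValid(); ++mfi) {
        ++n;
    }
    return n;
}

int main (int argc, char* argv[])
{
    amrex::Initialize(argc, argv);
    {
        amrex::Box domain(amrex::IntVect(0,0,0), amrex::IntVect(31,31,31)); // one 32^3 box
        amrex::BoxArray ba(domain);
        amrex::DistributionMapping dm(ba);
        amrex::MultiFab mf(ba, dm, 1, 0);

        amrex::Print() << "1024x8x8 -> " << count_tiles(mf, amrex::IntVect(1024,8,8)) << " tiles\n"; // expect 16
        amrex::Print() << "1024x4x4 -> " << count_tiles(mf, amrex::IntVect(1024,4,4)) << " tiles\n"; // expect 64
    }
    amrex::Finalize();
    return 0;
}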
I tried the setting, but it actually made things worse (43 seconds instead of 33). The node I'm running on only has 12 non-hyperthreaded cores, so maybe there is some contention with other processes running on the node (e.g., Slurm, the NFS client, etc.). I think the pull request for the fix is #2039.