bp4: integrate SCR calls

Open adammoody opened this issue 2 years ago • 4 comments

@pnorbert , @anagainaru This provides a partial, but workable SCR integration into ADIOS2. I'll open the PR so we have some place to discuss these changes.

Most SCR calls are placed in the application. For an example application, see examples/hello/bpWriter/helloBPWriter.c first. I started with it and added some comments to describe the SCR interface. When testing restart, I realized that helloBPWriter.c has no corresponding Reader, so I then added SCR calls to helloBPWriter.cpp and helloBPReader.cpp. The SCR calls in the application start up and shut down the SCR library, and they add the start/end bookends that define the boundaries of the checkpoint and restart phases.

SCR_Route_file is called from the BP4 engine. This should be called for each physical file that ADIOS writes. SCR manages directories, so I've commented out the mkdir operations for now. During a write phase, SCR_Route_file tells SCR that the file belongs to the active dataset and it provides the path where the file should (eventually) be written to on the parallel file system. As output from this call, SCR provides a path to where the file should be written instead. This output path may be a temporary location like /dev/shm or a node-local SSD depending on how SCR was configured.

I've included a buildme script that downloads and builds SCR for LSF, e.g., for use on Summit, and then configures ADIOS2 to point to that SCR installation. You may need to modify the module loads at the top depending on your system; otherwise, set the execute bit and run:

./buildme

SCR can be configured in different ways, but I prefer to use environment variables for development. I often set the following to verify that SCR calls are being made:

export SCR_DEBUG=1

One can run a test like:

cd build/bin
export SCR_DEBUG=1
jsrun -r2 ./hello_bpWriter_mpi
jsrun -r2 ./hello_bpReader_mpi

By default, SCR uses "cache bypass" so that each file is written directly to the file system. That is, SCR_Route_file just returns the original path that the user wanted to write to. It does not return a temporary path. This "cache bypass" mode should work on any system, including those that do not have sufficient temporary storage. To configure SCR to write to node local storage, disable "cache bypass" mode:

export SCR_CACHE_BYPASS=0

When running with cache enabled, one should see ADIOS files in subdirectories within /dev/shm/ like /dev/shm/$USER/scr.<jobid>/scr.dataset.<id>. The files will have also been copied to the parallel file system when the Writer application exits. By default, SCR flushes any cached checkpoint during SCR_Finalize().
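
A minimal sketch for checking that the cache was actually populated, assuming the directory layout described above (<cache base>/$USER/scr.<jobid>/scr.dataset.<id>). The function name list_scr_cache and its two optional arguments are hypothetical and not part of SCR:

```shell
#!/bin/bash
# list_scr_cache [BASE] [USERNAME]
# Hypothetical helper: prints every SCR dataset directory found under the
# cache base. BASE defaults to $SCR_CACHE_BASE, then /dev/shm.
# USERNAME defaults to $USER, then the output of `id -un`.
list_scr_cache() {
    local base="${1:-${SCR_CACHE_BASE:-/dev/shm}}"
    local user="${2:-${USER:-$(id -un)}}"
    local dir
    for dir in "$base/$user"/scr.*/scr.dataset.*; do
        # An unmatched glob is left as a literal string, so keep only
        # entries that are real directories.
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done
    return 0
}
```

Calling list_scr_cache with no arguments looks under /dev/shm for the current user. Since /dev/shm is node-local, the check has to run on the node where the ranks ran, not on a login node.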

It can be useful to run tests where SCR does not flush. One can use this mode to simulate a restart after a failure, in which case the SCR library may not have had a chance to flush the checkpoint:

export SCR_FLUSH=0

With this, SCR will not flush files during SCR_Finalize().

To demonstrate, delete the directory from the parallel file system and run the Writer again. This time, the files will be in /dev/shm but not on the parallel file system. If one then runs the Reader, the SCR library flushes the files from /dev/shm to the parallel file system during the call to SCR_Init().

This is because I have specified SCR_GLOBAL_RESTART=1 in the Reader code. This tells SCR to rebuild and flush any cached dataset during SCR_Init(). Not all applications need to do this, but for now this is required in order to maintain the directory structure that ADIOS2 expects to see when inspecting the files for reading.
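
The no-flush restart test above can be collected into one script. This is a sketch under assumptions: LAUNCH and run_restart_test are hypothetical names, the launcher defaults to the jsrun line used earlier, and the script is expected to run from build/bin after the earlier output directory has been removed from the parallel file system:

```shell
#!/bin/bash
# Hypothetical wrapper around the steps described above.
# LAUNCH is an assumed variable; override it for a different launcher.
LAUNCH="${LAUNCH:-jsrun -r2}"

run_restart_test() {
    export SCR_DEBUG=1          # print SCR debug messages
    export SCR_CACHE_BYPASS=0   # write to the cache, not directly to the file system
    export SCR_FLUSH=0          # skip the flush during SCR_Finalize()

    # $LAUNCH is left unquoted on purpose so that "jsrun -r2" splits into words.
    # Writer: files should end up in the cache only.
    $LAUNCH ./hello_bpWriter_mpi || return 1
    # Reader: with SCR_GLOBAL_RESTART=1 set in the Reader code, SCR_Init()
    # flushes the cached dataset before the files are read.
    $LAUNCH ./hello_bpReader_mpi || return 1
}
```

Run it with run_restart_test after sourcing the script from inside a job allocation.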

SCR uses /dev/shm as its default cache, since it is fast and available on any Linux cluster. One can specify another location by setting the "cache base", e.g., to point to a node-local SSD:

export SCR_CACHE_BASE=/mnt/ssd
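
On systems where the SSD mount point is not guaranteed to exist, one option is to fall back to /dev/shm. A minimal sketch, where pick_scr_cache_base is a hypothetical helper and /mnt/ssd is only the example path from above:

```shell
#!/bin/bash
# pick_scr_cache_base [CANDIDATE]
# Hypothetical helper: prints CANDIDATE if it is an existing, writable
# directory, otherwise prints /dev/shm (the SCR default cache).
pick_scr_cache_base() {
    local candidate="${1:-/mnt/ssd}"
    if [ -d "$candidate" ] && [ -w "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        printf '%s\n' /dev/shm
    fi
}

SCR_CACHE_BASE="$(pick_scr_cache_base /mnt/ssd)"
export SCR_CACHE_BASE
```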

I can follow up with a few tests to run. I think they would be instructive for demonstrating the current functionality. Before doing that, let me know if you are able to build and run the few tests above.

adammoody avatar Jul 23 '22 22:07 adammoody

Update after attempting to run this on Summit: I was unable to install SCR.

$ cmake -DCMAKE_INSTALL_PREFIX=/path/to/install -DCMAKE_BUILD_TYPE=Debug -DSCR_RESOURCE_MANAGER=LSF ..

CMake Error at /autofs/nccs-svm1_sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/cmake-3.23.1-ij35dzv4x2ql3uxn2n63ei4qr2uutjtu/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find PDSH (missing: PDSH_EXE DSHBAK_EXE)
Call Stack (most recent call first):
  /autofs/nccs-svm1_sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/cmake-3.23.1-ij35dzv4x2ql3uxn2n63ei4qr2uutjtu/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  scr/cmake/FindPDSH.cmake:22 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
  scr/cmake/SCR_DEPENDENCIES.cmake:98 (FIND_PACKAGE)
  CMakeLists.txt:24 (INCLUDE)

The buildme script is trying to load a gcc version that does not exist on Summit (I am currently using 9.1.0):

$ module avail gcc

--------------------------------------------- /sw/summit/spack-envs/base/modules/site/Core ---------------------------------------------
   gcc/7.5.0    gcc/9.1.0 (L,D)    gcc/9.3.0    gcc/10.2.0    gcc/11.1.0    gcc/11.2.0    gcc/12.1.0

----------------------------------------------------- /sw/summit/modulefiles/core ------------------------------------------------------
   hdf5_perf/1.10.6.gcc

The /etc/profile.d/z00_lmod.sh file doesn't exist either.

anagainaru avatar Aug 09 '22 14:08 anagainaru

Thanks, @anagainaru . Some of those lines are specific to LLNL systems. I've just pushed a commit to comment those out. I also updated the SCR cmake config to disable pdsh. You shouldn't need that just yet anyway. Can you do a pull and try again?

adammoody avatar Aug 09 '22 19:08 adammoody

I was able to build ADIOS2 with SCR but now I get errors when I try to run it:

SCR v3.0.0 ABORT: rank 1 on b33n15: Failed to record username @ /path/to/scr-v3.0.1/scr/src/scr.c:775
SCR v3.0.0 ABORT: rank 0 on b33n15: Failed to create .scr subdirectory /path/to/scr/build/.scr @ /path/to/scr-v3.0.1/scr/src/scr.c:688

There are two errors listed there.

MPI rank 1 bailed because it failed to find your username. SCR typically pulls this value from the $USER environment variable. Are you launching with jsrun or something else? Does jsrun not propagate your environment to all ranks by default?

MPI rank 0 failed to create a subdirectory. Is your build directory perhaps mounted as a read-only file system from the Summit compute nodes? If so, we could point SCR to GPFS instead. I can describe how if that's the problem.

adammoody avatar Aug 11 '22 17:08 adammoody

You are right, the second error was my mistake. I am not sure why it fails to find my username. I am using this script:

#!/bin/bash -l
#BSUB -P proj
#BSUB -W 00:01
#BSUB -nnodes 1
#BSUB -J adiosSCR
#BSUB -o scr.out.%J
#BSUB -e scr.out.%J

module load gcc

jsrun -r2 /ccs/home/againaru/adios/ADIOS2-scr/build/bin/hello_bpWriter_mpi

anagainaru avatar Aug 11 '22 19:08 anagainaru

I added the SCR_USER_NAME variable to point to my username in the submission script and now I am able to run the example.
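
A sketch of what that line in the submission script might look like. SCR_USER_NAME is the variable mentioned above; the fallback chain through $USER and `id -un` is an assumption added for illustration:

```shell
#!/bin/bash
# Set SCR_USER_NAME explicitly in case the launcher does not pass USER
# through to every rank. Keeps an existing value if one is already set.
export SCR_USER_NAME="${SCR_USER_NAME:-${USER:-$(id -un)}}"
echo "SCR_USER_NAME=$SCR_USER_NAME"
```

This would go before the jsrun line in the LSF script.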

[update] I created tests based on the BP testing and ADIOS2+SCR passes the tests.

@adammoody one question: how do I know that SCR is doing anything? Is there a verbose mode to run it? I tried setting export SCR_FLUSH=0 but I could not see anything in /dev/shm/.

anagainaru avatar Nov 23 '22 15:11 anagainaru