
Compile scorpio with adios2, got error on PIO_MAX_CACHED_STEPS_FOR_ADIOS

Open halehawk opened this issue 2 years ago • 49 comments

Hello,

I tried to compile scorpio with adios2 on derecho.ucar.edu and got the following error:

/glade/derecho/scratch/haiyingx/scorpio/src/clib/pioc_support.c:360:58: error: expected expression
file->max_step_calls = PIO_MAX_CACHED_STEPS_FOR_ADIOS;

It looks like PIO_MAX_CACHED_STEPS_FOR_ADIOS is set during cmake, as shown below. Am I missing anything in the cmake options?

Thanks, Haiying

Here is the partial cmake log:

CC=mpicc CXX=mpicxx FC=mpif90 cmake -DNetCDF_C_PATH=/glade/u/apps/derecho/23.06/spack/opt/spack/netcdf/4.9.2/cray-mpich/8.1.25/oneapi/2023.0.0/wzol/ \
-DNetCDF_Fortran_PATH=/glade/u/apps/derecho/23.06/spack/opt/spack/netcdf/4.9.2/cray-mpich/8.1.25/oneapi/2023.0.0/wzol/ \
-DPnetCDF_PATH=/glade/u/apps/derecho/23.06/spack/opt/spack/parallel-netcdf/1.12.3/cray-mpich/8.1.25/oneapi/2023.0.0/blyr/ \
-DWITH_ADIOS2=ON -DADIOS2_DIR=/glade/work/haiyingx/ADIOS2/installintelde/lib64/cmake/adios2 \
-DCMAKE_INSTALL_PREFIX=/glade/derecho/scratch/haiyingx/scorpio/install \
-DPIO_ENABLE_TESTS=ON \
..

-- ===== Configuring SCORPIO... =====
-- Enabling SCORPIO I/O performance statistics collection (default)
-- Using BGET to allocate memory for caching data in SCORPIO
-- Disabling debug logging in SCORPIO (default)
-- Disabling use/check of the MPI serial library (default)
-- Disabling saving I/O decompositions (default)
-- No limit on the number of cached I/O regions (default)
-- Limit on the number of Lustre OSTs, PIO_MAX_LUSTRE_OSTS, is not set (default)
-- Filesystem striping unit is not set (default)
-- Using PnetCDF independent data mode to read variables in SCORPIO (default)
-- Reserving some extra space in the header when creating NetCDF files, requested bytes = 10240 (default)
-- Setting the maximum number of I/O decompositions registered with ADIOS type to PIO_MAX_ADIOS_DECOMPS = 65536 (default)
-- Setting the maximum number of cached application steps for ADIOS type to PIO_MAX_CACHED_STEPS_FOR_ADIOS = 128 (default)
-- Disabling code coverage... (use -DPIO_ENABLE_COVERAGE:BOOL=ON to enable coverage, only GNU is supported for now)

halehawk avatar Jan 02 '24 23:01 halehawk

Can you include your complete configure/make log (you can copy it in a github gist and include the link to gist here)?

jayeshkrishna avatar Jan 03 '24 01:01 jayeshkrishna

I found that this error was caused by reusing a previously compiled libgptl.a from an earlier scorpio build. That problem is gone, but after the successful compilation many ctest tests fail. I am sharing the cmake log and ctest log here. Could you please tell me how I can get all ctest tests to pass? Thanks! https://gist.github.com/halehawk/1acad191f06c3d805edf2a6cf9e020b2 LastTest.log
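
One way to rule out stale artifacts such as a leftover libgptl.a is to look for static libraries left in the build tree and then configure in a freshly created build directory. The following is a minimal sketch; the function names `list_static_libs` and `fresh_build_dir` are hypothetical helpers, not part of SCORPIO.

```shell
# list_static_libs DIR: print every static library (*.a) found under DIR,
# e.g. a libgptl.a left over from an earlier build.
list_static_libs() {
    find "$1" -type f -name '*.a'
}

# fresh_build_dir DIR: remove DIR (if present) and recreate it empty, so the
# next configure/build cannot pick up previously compiled objects.
fresh_build_dir() {
    dir=$1
    # Refuse empty or root paths to avoid deleting something unintended.
    case "$dir" in
        ""|"/") echo "refusing to remove '$dir'" >&2; return 1 ;;
    esac
    rm -rf -- "$dir" && mkdir -p -- "$dir"
}
```

For example, `list_static_libs build` followed by `fresh_build_dir build && cd build` before rerunning cmake.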

halehawk avatar Jan 03 '24 03:01 halehawk

Can you try launching one of the tests in a batch file (without using ctest, directly using mpiexec/aprun etc)?

It looks like ctest support for this machine (derecho?) is not added in SCORPIO yet. It should be a small patch that we can provide to you for testing that should get the ctest testing working for you. Please try running the tests manually in a batch file and let us know the results (Also include the batch file in the issue).
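
A minimal sketch of launching a single test executable directly, outside ctest. The function name `run_test_manually` and the `LAUNCHER` variable are illustrative assumptions, and the scheduler directives that a batch file needs on a particular machine are not shown.

```shell
# run_test_manually NPROCS EXE [ARGS...]: launch one test binary directly,
# bypassing ctest. LAUNCHER defaults to mpiexec; set it to aprun, srun, etc.
# if that is the MPI launcher on the machine (assumed to accept "-n NPROCS").
run_test_manually() {
    nprocs=$1
    shift
    "${LAUNCHER:-mpiexec}" -n "$nprocs" "$@"
}
```

Inside a batch job this could be called as `run_test_manually 4 ./path/to/test_executable`, where the path is a placeholder for one of the built test binaries.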

jayeshkrishna avatar Jan 03 '24 16:01 jayeshkrishna

I requested an interactive node on derecho and ran ctest as well as a single test. Some tests pass now, but more still fail, and the run hangs at test #71. LastTest_0103.log

halehawk avatar Jan 03 '24 21:01 halehawk

@halehawk Your ADIOS2 lib is quite old (version 2.7.1):

-- Found ADIOS2: /glade/work/haiyingx/ADIOS2/installintelde/lib64/cmake/adios2/adios2-config.cmake (found suitable version "2.7.1.745", minimum required is "2.7.0") found components: C CXX Fortran MPI

For the latest ADIOS read support we require ADIOS2 2.9.0 or higher. You can install ADIOS2 2.9.1 for testing (we have not tested the latest 2.9.2 or 2.10.0-rc1 so far).
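
The version check described above can be sketched as a small script that reads the CMake configure output and compares the detected ADIOS2 version against 2.9.0. The function names are hypothetical, the log-line format is assumed to match the "Found ADIOS2" line quoted in this comment, and `sort -V` is assumed to be available (it is part of GNU coreutils).

```shell
# Extract the ADIOS2 version from CMake configure output read on stdin.
# Assumes a line of the form:
#   -- Found ADIOS2: ... (found suitable version "2.7.1.745", ...)
adios2_version_from_log() {
    sed -n 's/.*Found ADIOS2:.*found suitable version "\([0-9.]*\)".*/\1/p' | head -n 1
}

# version_ge A B: succeed when dotted version A is at least B.
# Relies on `sort -V` (version sort) to order the two strings.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Report whether the configured ADIOS2 meets the 2.9.0 minimum for read support.
check_adios2_for_read() {
    found=$(adios2_version_from_log)
    if [ -z "$found" ]; then
        echo "ADIOS2 version not found in log" >&2
        return 2
    fi
    if version_ge "$found" 2.9.0; then
        echo "ADIOS2 $found: OK for read support"
    else
        echo "ADIOS2 $found: older than 2.9.0"
        return 1
    fi
}
```

Usage would be along the lines of `check_adios2_for_read < cmake_configure.log`, where the log file name is a placeholder for wherever the configure output was saved.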

FYI, you can install ADIOS2 2.9.1 with the commands below (please change /path/to/your/adios/installation accordingly)

wget https://github.com/ornladios/ADIOS2/archive/refs/tags/v2.9.1.tar.gz
tar zxf v2.9.1.tar.gz

cd ADIOS2-2.9.1

mkdir build
cd build

CC=mpicc CXX=mpicxx FC=mpif90 \
CFLAGS="-g -O2" CPPFLAGS="-g -O2" CXXFLAGS="-g -O2" FCFLAGS="-g -O2" \
cmake \
-DCMAKE_INSTALL_PREFIX=/path/to/your/adios/installation \
-DBUILD_SHARED_LIBS=OFF -DADIOS2_BUILD_EXAMPLES=OFF -DBUILD_TESTING=OFF \
-DADIOS2_USE_Blosc2=OFF \
-DADIOS2_USE_BZip2=OFF \
-DADIOS2_USE_ZFP=OFF \
-DADIOS2_USE_SZ=OFF \
-DADIOS2_USE_MGARD=OFF \
-DADIOS2_USE_PNG=OFF \
-DADIOS2_USE_DataMan=OFF \
-DADIOS2_USE_DataSpaces=OFF \
-DADIOS2_USE_MHS=OFF \
-DADIOS2_USE_ZeroMQ=OFF \
-DADIOS2_USE_HDF5=OFF \
-DADIOS2_USE_Python=OFF \
-DADIOS2_USE_Fortran=OFF \
-DADIOS2_USE_Profiling=OFF \
..

make -j4

make install

dqwu avatar Jan 04 '24 15:01 dqwu

@halehawk It seems that you configured SCORPIO with NetCDF 4.9.2:

-- Checking NetCDF version - 4.9.2

Our nightly builds use an older but more stable NetCDF 4.8.0 lib for testing (higher versions have some known issues for the unit tests). In fact, you do not need NetCDF lib for testing ADIOS type (PnetCDF lib is sufficient).

FYI, you can use "-DWITH_NETCDF=OFF" to configure SCORPIO without NetCDF support (you can safely remove -DNetCDF_C_PATH=XXXX and -DNetCDF_Fortran_PATH=XXXX in this case).

dqwu avatar Jan 04 '24 16:01 dqwu

@halehawk As a sanity check on that machine, I would suggest that you first run the full SCORPIO unit tests with PnetCDF type only (SCORPIO is configured without NetCDF or ADIOS support):

git clone https://github.com/E3SM-Project/scorpio.git
cd scorpio

mkdir build
cd build

CC=mpicc CXX=mpicxx FC=mpif90 cmake -Wno-dev \
-DWITH_NETCDF=OFF \
-DPnetCDF_PATH=/glade/u/apps/derecho/23.06/spack/opt/spack/parallel-netcdf/1.12.3/cray-mpich/8.1.25/oneapi/2023.0.0/blyr \
-DPIO_USE_MALLOC=ON \
-DPIO_ENABLE_TESTS=ON \
-DPIO_ENABLE_EXAMPLES=ON \
..

make -j4

make tests

ctest

PIO_ENABLE_EXAMPLES is set to ON to enable testing some C examples. PIO_USE_MALLOC is set to ON (recommended) to use native malloc (instead of bget package).

dqwu avatar Jan 04 '24 16:01 dqwu

Sure, I will try your last suggestion first.

halehawk avatar Jan 04 '24 16:01 halehawk

@halehawk ADIOS IO type in SCORPIO has some known limitations, see #553 for detailed information.

For ADIOS type, you should not run all of the unit tests (C and Fortran). In fact, our nightly builds only test a subset of the Fortran unit tests (C unit tests do not support ADIOS type so far), plus C example test_adios.c.

After you install a custom ADIOS 2.9.1 lib on that machine, I would suggest that you test ADIOS write feature first (ADIOS read support has more restrictions).

FYI, below are the commands to test ADIOS write feature:

git clone https://github.com/E3SM-Project/scorpio.git
cd scorpio

mkdir build
cd build

ADIOS2_DIR=/path/to/your/adios2/2.9.1/installation \
CC=mpicc CXX=mpicxx FC=mpif90 cmake -Wno-dev \
-DWITH_NETCDF=OFF \
-DWITH_ADIOS2=ON \
-DADIOS_BP2NC_TEST=ON \
-DPnetCDF_PATH=/glade/u/apps/derecho/23.06/spack/opt/spack/parallel-netcdf/1.12.3/cray-mpich/8.1.25/oneapi/2023.0.0/blyr \
-DPIO_USE_MALLOC=ON \
-DPIO_ENABLE_TESTS=ON \
-DPIO_ENABLE_EXAMPLES=ON \
..

make -j4

make tests

ctest -R "pio_unit_test|^init|pio_file\
|ncdf_get_put|ncdf_inq|ncdf_simple_tests|pio_rearr\
|pio_decomp|pio_sync_tests|pio_buf_lim_tests|pio_iodesc_tests\
|pio_iosystem_tests|examplePio|example1|darray_no_async|test_adios"

Note that we only run a subset of the Fortran unit tests and test_adios example with the ctest command.

Also, ADIOS_BP2NC_TEST is set to ON to perform implicit file conversion (ADIOS to NetCDF) such that we actually read converted .nc files for testing.
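
The `-R` argument to ctest is a single regular expression, so the selected test names are joined with `|`. As a sketch, the same pattern can be assembled from a list of names, which is easier to edit than one long quoted string. The function name `join_regex` and the variable `ADIOS_WRITE_TESTS` are hypothetical.

```shell
# join_regex NAME...: join the arguments with '|' into one alternation
# pattern. The IFS change is confined to a subshell.
join_regex() {
    (
        IFS='|'
        printf '%s\n' "$*"
    )
}

# Test-name patterns taken from the ctest command in this comment.
ADIOS_WRITE_TESTS=$(join_regex \
    pio_unit_test '^init' pio_file \
    ncdf_get_put ncdf_inq ncdf_simple_tests pio_rearr \
    pio_decomp pio_sync_tests pio_buf_lim_tests pio_iodesc_tests \
    pio_iosystem_tests examplePio example1 darray_no_async test_adios)

# Print the assembled pattern for inspection.
echo "$ADIOS_WRITE_TESTS"
```

The result can then be passed as `ctest -R "$ADIOS_WRITE_TESTS"` from the build directory.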

dqwu avatar Jan 04 '24 16:01 dqwu

@halehawk For testing ADIOS read feature, we only run test_adios.c example so far (the shorter list of applicable Fortran unit tests has not been determined yet). Simply remove "-DADIOS_BP2NC_TEST=ON" and only run test_adios with the ctest command:

git clone https://github.com/E3SM-Project/scorpio.git
cd scorpio

mkdir build
cd build

ADIOS2_DIR=/path/to/your/adios2/2.9.1/installation \
CC=mpicc CXX=mpicxx FC=mpif90 cmake -Wno-dev \
-DWITH_NETCDF=OFF \
-DWITH_ADIOS2=ON \
-DPnetCDF_PATH=/glade/u/apps/derecho/23.06/spack/opt/spack/parallel-netcdf/1.12.3/cray-mpich/8.1.25/oneapi/2023.0.0/blyr \
-DPIO_USE_MALLOC=ON \
-DPIO_ENABLE_TESTS=ON \
-DPIO_ENABLE_EXAMPLES=ON \
..

make -j4

make tests

ctest -R test_adios

dqwu avatar Jan 04 '24 17:01 dqwu

I tested scorpio with PnetCDF only. All tests up to test #69 passed. LastTest_0104.log

halehawk avatar Jan 04 '24 18:01 halehawk

I tried to build esmf against scorpio. I included the scorpio include path and the adios2 include path, but I still got this error:

In file included from /glade/.../esmf/src/Infrastructure/IO/src/ESMCI_IO_Handler.C:36:
In file included from /glade/.../esmf/src/Infrastructure/IO/src/../include/ESMCI_PIO_Handler.h:45:
/glade/.../scorpio/install/include/pio.h:71:10: fatal error: 'spio_hash.h' file not found
In file included from /glade/.../esmf/src/Infrastructure/IO/src/ESMCI_PIO_Handler.C:27:

It looks like spio_hash.h is not in my scorpio include path. How can I change the cmake options to install this header file?

halehawk avatar Jan 10 '24 22:01 halehawk

It might take some time for us to fix this issue. If you manually copy spio_hash.h to /glade/.../scorpio/install/include, does it work?
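
A minimal sketch of that manual workaround: locate the header somewhere under the SCORPIO source tree and copy it into the install include directory. The function name `install_missing_header` is hypothetical, and the location of spio_hash.h inside the source tree is not assumed, which is why the sketch searches for it.

```shell
# install_missing_header SRC_TREE HEADER DEST_DIR: find HEADER under SRC_TREE
# and copy the first match into DEST_DIR. Returns non-zero if nothing is found.
install_missing_header() {
    src_tree=$1
    header=$2
    dest=$3
    found=$(find "$src_tree" -type f -name "$header" | head -n 1)
    if [ -z "$found" ]; then
        echo "$header not found under $src_tree" >&2
        return 1
    fi
    mkdir -p -- "$dest" && cp -- "$found" "$dest/"
}
```

For example, `install_missing_header /path/to/scorpio/src spio_hash.h /path/to/scorpio/install/include`, with both paths as placeholders.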

dqwu avatar Jan 10 '24 22:01 dqwu

Yes, that error is gone. Then I got this error:

/glade/.../esmf/src/Infrastructure/Mesh/src/ESMCI_UGRID_Util.C:357:11: error: use of undeclared identifier 'PIOc_InitDecomp_ReadOnly'
piorc = PIOc_InitDecomp_ReadOnly(pioSystemDesc, PIO_INT,

Is there any macro I should define to avoid this error?

halehawk avatar Jan 10 '24 23:01 halehawk

And this error:

/glade/.../esmf/src/Infrastructure/Mesh/src/ESMCI_Mesh_FileIO.C:219:13: error: use of undeclared identifier 'PIOc_free_iosystem';

halehawk avatar Jan 10 '24 23:01 halehawk

@halehawk It seems that PIOc_InitDecomp_ReadOnly is an API of NCAR PIO, which is not supported by SCORPIO.

dqwu avatar Jan 10 '24 23:01 dqwu

ESMCI_Mesh_FileIO.C might need to be updated to use SCORPIO APIs.

dqwu avatar Jan 10 '24 23:01 dqwu

PIOc_InitDecomp_ReadOnly sets up a read-only decomposition. Can SCORPIO read nc files directly?

halehawk avatar Jan 10 '24 23:01 halehawk

@halehawk I think you can simply use PIOc_InitDecomp() in ESMCI_Mesh_FileIO.C. It has the same parameters as PIOc_InitDecomp_ReadOnly() in the NCAR PIO code. SCORPIO does not support PIOc_InitDecomp_ReadOnly() so far.

int PIOc_InitDecomp(int iosysid, int pio_type, int ndims, const int *gdimlen, int maplen,
                    const PIO_Offset *compmap, int *ioidp, const int *rearranger,
                    const PIO_Offset *iostart, const PIO_Offset *iocount)

int PIOc_InitDecomp_ReadOnly(int iosysid, int pio_type, int ndims, const int *gdimlen, int maplen,
                             const PIO_Offset *compmap, int *ioidp, const int *rearranger,
                             const PIO_Offset *iostart, const PIO_Offset *iocount)

dqwu avatar Jan 10 '24 23:01 dqwu

After I replaced PIOc_InitDecomp_ReadOnly with PIOc_InitDecomp, I still have this error:

/glade/.../esmf/src/Infrastructure/Mesh/src/ESMCI_Mesh_FileIO.C:219:13: error: use of undeclared identifier 'PIOc_free_iosystem';

Which API can replace PIOc_free_iosystem? Thanks!

halehawk avatar Jan 11 '24 01:01 halehawk

@halehawk You can use PIOc_finalize() instead of PIOc_free_iosystem().

FYI, NCAR PIO added PIOc_free_iosystem() as a duplicate for PIOc_finalize():

There are some naming inconsistencies in the PIO C API. I would like to fix them by adding new functions with better nomenclature, and leaving old functions alone. But the documentation and examples can use the new consistent nomenclature and that will be good.

There are several examples but I will start with the simplest, PIOc_finalize().

The name is confusing because usually finalize means to shut down the library altogether, releasing all resources. However, in our case it just means to release one IOSystem. If multiple IOSystems are in use, then finalize must be called for each of them.

So a better name would be PIOc_free_iosystem();
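
The two substitutions suggested in this thread (PIOc_InitDecomp for PIOc_InitDecomp_ReadOnly, PIOc_finalize for PIOc_free_iosystem) can be applied mechanically. The following is a sketch only: the function name `port_ncar_pio_calls` is hypothetical, and the rename addresses just these two undeclared-identifier errors without establishing that the resulting build behaves correctly.

```shell
# port_ncar_pio_calls FILE: print FILE to stdout with the two NCAR PIO
# function names replaced by the SCORPIO functions suggested as substitutes.
# The input file itself is left unmodified.
port_ncar_pio_calls() {
    sed -e 's/PIOc_InitDecomp_ReadOnly/PIOc_InitDecomp/g' \
        -e 's/PIOc_free_iosystem/PIOc_finalize/g' "$1"
}
```

For example, `port_ncar_pio_calls ESMCI_Mesh_FileIO.C > ESMCI_Mesh_FileIO.C.new`, then review the differences before replacing the original file.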

dqwu avatar Jan 11 '24 02:01 dqwu

OK, I commented free_iosystem out to continue the compilation; I will change it to PIOc_finalize. Now the Scorpio static libraries fail to link. Do you know where I can add -fPIC in your CMakeLists.txt to build a shared library? Thanks!

halehawk avatar Jan 11 '24 03:01 halehawk

AFAIK, ESMF currently cannot use SCORPIO (The PIO library added some APIs specifically for ESMF that are currently not available in SCORPIO. It has been in our todo list but hasn't been a high priority task yet.)

jayeshkrishna avatar Jan 11 '24 03:01 jayeshkrishna

@halehawk E3SM uses static scorpio lib without any build issues. I think you should be able to build your application with static scorpio lib as well. @jayeshkrishna Any thoughts on this?

dqwu avatar Jan 11 '24 15:01 dqwu

Using static lib should work fine, however as I mentioned above you cannot use SCORPIO with ESMF right now.

jayeshkrishna avatar Jan 11 '24 15:01 jayeshkrishna

So I just used the scorpio static lib with esmf and changed the previous two APIs; now at least esmf compiles.

halehawk avatar Jan 11 '24 21:01 halehawk

I built esmf with scorpio enabled, but when I tried to build cesm with this esmf, I got the following errors:

/glade/.../tmp/ifortAc6Qfb.i: error #5286: Ambiguous generic interface PIO_INIT: previously declared specific procedure SPIO_INIT::PIO_INIT_INTRACOMM is not distinguishable from this declaration. [PIOLIB_MOD::INIT_INTRACOM]

/glade/.../tmp/ifortAc6Qfb.i: error #5286: Ambiguous generic interface PIO_SETERRORHANDLING: previously declared specific procedure SPIO_ERR::PIO_SETERRORHANDLING_IOSYS is not distinguishable from this declaration. [PIOLIB_MOD::SETERRORHANDLINGIOSYSTEM]

/glade/.../my_cesm_sandbox1/components/cam/src/dynamics/mpas/dycore/src/framework/mpas_io.F(313): error #6405: The same named entity from different modules and/or program units cannot be referenced. [PIO_OPENFILE]

Is this caused by my including the spio header files? If not, how can I use the PIO module instead of the SPIO module? I looked at libpiof.a and only saw spio_init, not pio_init:

U PIOc_Init_Intercomm_from_F90
U PIOc_Init_Intracomm_from_F90
0000000000001c40 T spio_init._
0000000000000c00 T spio_init_mp_pio_finalize_

halehawk avatar Jan 16 '24 20:01 halehawk

SPIO_ERR

You might try configuring scorpio with -DPIO_USE_FORTRAN_LEGACY_LIB=ON

dqwu avatar Jan 16 '24 21:01 dqwu

After I enable -DPIO_USE_FORTRAN_LEGACY_LIB=ON, can I still use PIO_IOTYPE_ADIOS2?

halehawk avatar Jan 16 '24 22:01 halehawk

Yes you can.

dqwu avatar Jan 16 '24 22:01 dqwu