[Bug]: Builds but fails many (not all) tests

Open mathomp4 opened this issue 1 year ago • 4 comments

What happened?

First, my system is:

  • macOS 13.4
  • GCC 12.2
  • Open MPI 4.1.5
  • HDF5 1.10.10
  • python 3.11.3

This is part of my ongoing attempt to make neural-fortran work with pre-compiled libraries (see https://github.com/modern-fortran/neural-fortran/issues/128 and https://github.com/modern-fortran/neural-fortran/pull/129).

Also, I've tried this with both 4.6.3 (the version neural-fortran points to) and 4.10.2 (the current latest release). All errors below are for 4.10.2, but I had failures with 4.6.3 as well.

Now, I have to do some weird things due to how we build HDF5 in our library stack (autotools, static, and then weird install paths on that), but if I add:

-DHDF5_ROOT="$(prefix);$(prefix)/include/hdf5;$(prefix)/include/szlib"

to the cmake line, everything seems to be found:

-- Looking for H5_HAVE_FILTER_SZIP
-- Looking for H5_HAVE_FILTER_SZIP - found
-- Looking for H5_HAVE_FILTER_DEFLATE
-- Looking for H5_HAVE_FILTER_DEFLATE - found
-- Looking for H5_HAVE_PARALLEL
-- Looking for H5_HAVE_PARALLEL - found
...
-- Looking for H5Pset_fapl_mpio
-- Looking for H5Pset_fapl_mpio - found
-- Performing Test HDF5_C_links
-- Performing Test HDF5_C_links - Success
-- Performing Test HDF5_Fortran_links
-- Performing Test HDF5_Fortran_links - Success
-- Found HDF5: /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/aarch64-apple-darwin22.5.0/gfortran/Darwin/lib/libhdf5_hl.a;/Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/aarch64-apple-darwin22.5.0/gfortran/Darwin/lib/libhdf5.a (found version "1.10.10") found components: Fortran
...

and a make install runs fine as well.

But on ctest:

Test project /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/src/h5fortran/build
      Start  1: minimal
 1/27 Test  #1: minimal ..........................   Passed    0.70 sec
      Start  2: array
 2/27 Test  #2: array ............................***Failed    0.28 sec
      Start  3: attributes
 3/27 Test  #3: attributes .......................***Failed    0.27 sec
      Start 24: PythonAttributes
 4/27 Test #24: PythonAttributes .................   Passed    0.13 sec
      Start  4: attributes_read
 5/27 Test  #4: attributes_read ..................   Passed    0.27 sec
      Start  5: cast
 6/27 Test  #5: cast .............................***Failed    0.28 sec
      Start  6: deflate_write
 7/27 Test  #6: deflate_write ....................***Failed    0.26 sec
      Start  7: deflate_read
Failed test dependencies: deflate_write
 8/27 Test  #7: deflate_read .....................***Not Run   0.00 sec
      Start  8: deflate_props
Failed test dependencies: deflate_write
 9/27 Test  #8: deflate_props ....................***Not Run   0.00 sec
      Start  9: destructor
10/27 Test  #9: destructor .......................   Passed    0.26 sec
      Start 10: exist
11/27 Test #10: exist ............................***Failed    0.26 sec
      Start 11: fill
12/27 Test #11: fill .............................***Failed    0.25 sec
      Start 12: groups
13/27 Test #12: groups ...........................***Failed    0.26 sec
      Start 20: write
14/27 Test #20: write ............................   Passed    0.25 sec
      Start 13: layout
15/27 Test #13: layout ...........................***Failed    0.25 sec
      Start 14: lt
16/27 Test #14: lt ...............................***Failed    0.25 sec
      Start 15: scalar
17/27 Test #15: scalar ...........................   Passed    0.26 sec
      Start 16: shape
18/27 Test #16: shape ............................   Passed    0.25 sec
      Start 17: string
19/27 Test #17: string ...........................***Failed    0.26 sec
      Start 26: PythonString
20/27 Test #26: PythonString .....................   Passed    0.12 sec
      Start 18: string_read
21/27 Test #18: string_read ......................   Passed    0.28 sec
      Start 19: version
22/27 Test #19: version ..........................   Passed    0.25 sec
      Start 21: fail_read_size_mismatch
23/27 Test #21: fail_read_size_mismatch ..........   Passed    0.26 sec
      Start 22: fail_read_rank_mismatch
24/27 Test #22: fail_read_rank_mismatch ..........   Passed    0.25 sec
      Start 23: fail_nonexist_variable
25/27 Test #23: fail_nonexist_variable ...........   Passed    0.25 sec
      Start 25: PythonShape
26/27 Test #25: PythonShape ......................   Passed    0.13 sec
      Start 27: h5ls
27/27 Test #27: h5ls .............................***Not Run (Disabled)   0.00 sec

54% tests passed, 12 tests failed out of 26

Label Time Summary:
h5fortran    =   6.28 sec*proc (27 tests)
python       =   0.38 sec*proc (3 tests)
shaky        =   0.76 sec*proc (3 tests)

Total Test time (real) =   6.29 sec

The following tests did not run:
	 27 - h5ls (Disabled)

The following tests FAILED:
	  2 - array (Failed)
	  3 - attributes (Failed)
	  5 - cast (Failed)
	  6 - deflate_write (Failed)
	  7 - deflate_read (Not Run)
	  8 - deflate_props (Not Run)
	 10 - exist (Failed)
	 11 - fill (Failed)
	 12 - groups (Failed)
	 13 - layout (Failed)
	 14 - lt (Failed)
	 17 - string (Failed)
Errors while running CTest
Output from these tests are in: /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/src/h5fortran/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

I'll paste below the --rerun-failed --output-on-failure output.

Any ideas what is happening? The weird thing is, I'm pretty sure I got this working a few weeks back. I had to move on to other projects, but I had time today and...not working. You can see below that there seem to be a lot of:

ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

errors.
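
To check whether every failure really stops on the same message, the saved ctest output can be tallied. This is only a rough sketch: `count_error_stops` is a hypothetical helper name (not part of h5fortran or CTest), and the sample input is a short excerpt of the log in this report.

```shell
#!/bin/sh
# Sketch: count how often each distinct "ERROR STOP" line appears in saved
# ctest output. count_error_stops is a hypothetical helper, not an
# h5fortran or CTest command.
count_error_stops() {
  # Keep only the ERROR STOP lines, group identical ones, and strip the
  # leading padding that `uniq -c` puts in front of each count.
  grep '^ERROR STOP' "$1" | sort | uniq -c | sed 's/^ *//'
}

# Short excerpt of the log from this report, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
      Start  3: attributes
 2/13 Test  #3: attributes .......................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  5: cast
 3/13 Test  #5: cast .............................***Failed    0.01 sec
OK: cast write
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize
EOF

count_error_stops "$log"
rm -f "$log"
```

A single output line would suggest one common cause rather than several unrelated test bugs.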

Relevant log output

Test project /Users/mathomp4/Baselibs/ESMA-Baselibs-main-with-neural-fortran/src/h5fortran/build
      Start  2: array
 1/13 Test  #2: array ............................***Failed    0.01 sec
  1  2  3  4
  2  4  6  8
  3  6  9 12
  4  8 12 16
 PASSED: array write
 PASSED: slice read
 PASSED: create dataset and write slice 1D
 PASSED: overwrite slice 1d, stride=1
 PASSED: overwrite slice 1d, no stride
 h5fortran:TRACE:create: deflate: /int32a-2d
 create and write slice 2d, stride=1
 PASSED: slice write
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  3: attributes
 2/13 Test  #3: attributes .......................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  5: cast
 3/13 Test  #5: cast .............................***Failed    0.01 sec
OK: cast write
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  6: deflate_write
 4/13 Test  #6: deflate_write ....................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start  7: deflate_read
Failed test dependencies: deflate_write
 5/13 Test  #7: deflate_read .....................***Not Run   0.00 sec
      Start  8: deflate_props
Failed test dependencies: deflate_write
 6/13 Test  #8: deflate_props ....................***Not Run   0.00 sec
      Start 10: exist
 7/13 Test #10: exist ............................***Failed    0.01 sec
 OK: is_hdf5
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 11: fill
 8/13 Test #11: fill .............................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 12: groups
 9/13 Test #12: groups ...........................***Failed    0.01 sec
 OK: HDF5 group
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 20: write
10/13 Test #20: write ............................   Passed    0.01 sec
      Start 13: layout
11/13 Test #13: layout ...........................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 14: lt
12/13 Test #14: lt ...............................***Failed    0.01 sec
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize

      Start 17: string
13/13 Test #17: string ...........................***Failed    0.01 sec
 OK: HDF5 string write
 OK: HDF5 string read
ERROR STOP ERROR:h5fortran:open: HDF5 library initialize


8% tests passed, 12 tests failed out of 13

Label Time Summary:
h5fortran    =   0.13 sec*proc (13 tests)

Total Test time (real) =   0.13 sec

The following tests FAILED:
	  2 - array (Failed)
	  3 - attributes (Failed)
	  5 - cast (Failed)
	  6 - deflate_write (Failed)
	  7 - deflate_read (Not Run)
	  8 - deflate_props (Not Run)
	 10 - exist (Failed)
	 11 - fill (Failed)
	 12 - groups (Failed)
	 13 - layout (Failed)
	 14 - lt (Failed)
	 17 - string (Failed)
Errors while running CTest

mathomp4, May 31 '23 14:05

This seems to be a bug with specific versions of the HDF5 library: 1.10.10 and 1.14.0 so far. I am trying to figure out a solution.

scivision, Jul 24 '23 21:07

I think this was fixed by HDF5 1.14.2. Does it work for you currently?

scivision, Oct 23 '23 15:10

I'll try and see. I've been holding off on moving to HDF5 1.14 (as we have some code that assumes 1.10). But maybe this is the time to move forward...

mathomp4, Oct 23 '23 15:10

I added an option to force a local HDF5 build (under the build directory):

cmake -Bbuild -Dfind=no

cmake --build build

That would build HDF5 1.14.2 and then h5fortran.
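
The two commands above, followed by a test run, could be wrapped roughly as follows. This is a sketch under assumptions: `build_with_local_hdf5` is a hypothetical function name, `-Dfind=no` is the option mentioned in this thread, and `ctest --test-dir` assumes CMake 3.20 or newer.

```shell
#!/bin/sh
# Sketch: configure h5fortran so it builds its own HDF5, build it, then run
# the tests. build_with_local_hdf5 is a hypothetical wrapper, not something
# h5fortran ships.
build_with_local_hdf5() {
  src_dir=$1   # path to the h5fortran source tree

  # -Dfind=no is the option from this thread: skip the system HDF5 and
  # build HDF5 under the build directory instead.
  cmake -S "$src_dir" -B "$src_dir/build" -Dfind=no &&
    cmake --build "$src_dir/build" &&
    # --test-dir is assumed to be available (CMake >= 3.20).
    ctest --test-dir "$src_dir/build" --output-on-failure
}
```

Calling `build_with_local_hdf5 path/to/h5fortran` stops at the first step that fails, so a configure or build error is not hidden behind the test output.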

scivision, Oct 23 '23 21:10