ompi icon indicating copy to clipboard operation
ompi copied to clipboard

HDF5 Info test fails with ompi-main

Open edgargabriel opened this issue 1 year ago • 17 comments

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

main developer branch

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

regular ./configure flags

Please describe the system on which you are running

  • Operating system/version: Linux
  • Computer hardware: irrelevant
  • Network type: irrelevant

Details of the problem

Open MPI main is failing an hdf5 test that is testing Info object handling. The test has been verified to pass on 5.0.3, so its really main that is effected.

testing  -- fapl_mpio duplicate (mpiodup) 
Proc 3: *** Parallel ERROR ***
    VRFY (new and old nkeys equal) failed at line  104 in t_ph5basic.c

The error is triggered by an unexpected value returned by MPI_Info_get_nkeys

The test has been confirmed to fail with hdf5-1.14.2 and hdf5-1.14.4.3. To reproduce the error:

  • compile Open MPI main branch
  • compile hdf5 with
    • ./configure --enable-parallel --prefix=<choose_your_location>
    • make -j
    • make install
  • run the testphdf5
    • cd testpar
    • mpirun -np 4 ./testphdf5

edgargabriel avatar Aug 10 '24 22:08 edgargabriel

@devreal since we talked about this issue offline, I will assign it to you. Please don't hesitate to ping me if you need any help!

edgargabriel avatar Aug 10 '24 22:08 edgargabriel

From what I understand, the test does the following:

  1. Create an info object and set the key "hdf_info_name", asking for the number of info keys in the info object, and associating this info with a comm.
  2. Then duplicating the comm and asking again for the number of info keys associated with the duplicated comm, throwing an error if the number of info keys associated with the new comm is not the same as the number of keys set in the original info object.

This test is incorrect. Since MPI 4, we are not allowed to return info keys that we do not handle. Section 7.4.4 says:

An MPI implementation is required to return all hints that are supported by the implementation and have default values specified; any user-supplied hints that were not ignored by the implementation; and any additional hints that were set by the implementation. If no such hints exist, a handle to a newly created info object is returned that contains no key/value pair.

We ignore the "hdf_info_name" info key (since we don't know it) so we are not allowed to return it in the queried info object.

Info objects are not meant to associate user information with MPI objects. Attributes should be used instead. The rationale for that is that it gives users a chance to query whether user-provided info keys are supported by the implementation.

devreal avatar Aug 11 '24 22:08 devreal

@devreal ok, thank you! I think we should probably file a ticket with the HDF5 folks in that case, will ping you in a bit about that.

edgargabriel avatar Aug 12 '24 13:08 edgargabriel

We had yesterday another exchange with @devreal regarding this issue, the test passes with ompi 5.0.3 but fails with the 5.0.x branch, so probably something in the April commit that touched the Info objects (https://github.com/open-mpi/ompi/pull/12529) impacted this test. We need to clarify whether the behavior is indeed intended, before we file the issue with HDF5 group

edgargabriel avatar Aug 13 '24 15:08 edgargabriel

Hi all, has there been any recent discussion on whether HDF5 is doing something incorrect? I had previously been using version 5.0.3 of OpenMPI, but we received an issue report recently with OpenMPI 5.0.4, so I came looking here. I looked at section 7.4.4 of the 4.1 standard document, but couldn't find the section of text that @devreal mentioned; I was likely just looking in the wrong place though.

Currently, the only place where HDF5 uses a communicator attribute is to register our library termination function, H5_term_library(), on MPI_COMM_SELF so that it gets called when MPI_Finalize() is invoked. Otherwise, we use MPI Info objects for passing around arbitrary user-provided hints and have been making assumptions around those being preserved.

jhendersonHDF avatar Sep 06 '24 17:09 jhendersonHDF

@jhendersonHDF the text is in Chapter 7.4. 4 in MPI4.1 as well, its in the description of MPI_Comm_get_info (bottom of page 342).

I think we are dealing here with two potentially separate issues:

  1. is the test in HDF5 test-suite compliant with the MPI4.x specification?
  2. why did the fix to MPI Info object handling that was introduced in Open MPI 5.0.4 suddenly break this test? Based on my offline discussion with @devreal it should not have had an impact, i.e. the test should have passed either both before and after the fix, or in neither one of the cases.

edgargabriel avatar Sep 06 '24 18:09 edgargabriel

@jhendersonHDF could you help us identify the urgency of this issue? Is HDF5 with Open MPI 5.0.4 and newer actually broken from the functional perspective, are some aspects broken, or is mostly just the test impacted?

edgargabriel avatar Sep 11 '24 12:09 edgargabriel

@edgargabriel I don't think I'd necessarily call this issue urgent. The only functionality in HDF5 that appears to be broken is the ability to retrieve user-supplied hints from an Info object stored on an HDF5 File Access Property List after calling H5Pget_fapl_mpio. For the most part, it seems like users rely on this for just the hints that an MPI implementation would be paying attention to. However, there may be legacy use cases for user-specific hints. It also seems like it would simplify code a bit if an application can rely on getting their hint back, even if the MPI implementation ignored it.

It seems that HDF5 has long assumed that the Info object could be used for arbitrary data that would be propagated between communicators. If that's changed with the MPI 4 standard, it seems like something we should fix on our end. To keep the same behavior, the library would likely just store a backup of all the hints and replace as necessary. Though, it would definitely be nice to get clarification on whether we're compliant with the standard or not before making those changes since there are a few different places in the library to be considered. The text in the standard referenced here is under the MPI_Comm_get_info function, but I'm assuming that this applies to other functions such as MPI_File_get_info as well?

jhendersonHDF avatar Sep 11 '24 17:09 jhendersonHDF

@jhendersonHDF thank you: you raised a very valid point whether the MPI spec is consistent in the Info object handling for Communicators and Files. I double checked, MPI 4.1 has exactly the same language for Files as well, its in section 14.2.8 on page 648.

Beyond this, we should probably confirm with the MPICH/ROMIO folks that there reading of the standard is the same as ours. This should clarify whether HDF5 would need to make an adjustment, or us.

edgargabriel avatar Sep 11 '24 18:09 edgargabriel

MPI 4.1 Chapter 10 Page 473 Line 21.5 says

An implementation must support info objects as caches for arbitrary (key,value) pairs, regardless of whether it recognizes the key.

Whoever changed Chapter 7 to contradict this made a grievous error.

jeffhammond avatar Sep 12 '24 08:09 jeffhammond

Nobody changed Chapter 7 recently to make such a change and as indicated by @edgargabriel files have the same behavior, which indicates that the MPI standard has been consistent with the use of MPI_Info. Moreover, I don't think this is contradictory to the MPI_Info description in Chapter 10, as here it describes the generic aspect of an MPI_Info, aka. being a generic (key, value) store while manipulated as an object by itself. Users should be able to pass such an object around, and cache whatever they need into it. However, once such an MPI_Info object is attached to another MPI object, the (key, value) tuples get selectively copied into the other MPI object.

To summarize, MPI_Info is a generic storage but communicator and files are not, and OMPI exhibits the behavior specified by the MPI standard.

bosilca avatar Sep 13 '24 16:09 bosilca

We're planning to discuss this issue at the next I/O WG meeting, on Oct. 3, at 2:30pm CT. Probably will be @edgargabriel , @devreal , Neil Fortner, and myself, at least. Would anyone else like specifics for the WG meeting so that they can attend?

qkoziol avatar Sep 19 '24 20:09 qkoziol

I took a look at the code and it looks like the HDF5 test is only using MPI_Info_dup(). MPI_Comm_get_info() does not occur in the HDF5 source, and there are no files in that test. After calling MPI_Info_dup(), the test expects to get arbitrary user-set info keys back. MPI 4.1 Chapter 10 Page 477 Line 45 says:

MPI_INFO_DUP duplicates an existing info object, creating a new object, with the same ( key, value) pairs and the same ordering of keys.

Which to me, together with the line @jeffhammond posted, suggests it should copy all keys, even ones it doesn't recognize.

Interestingly, HDF5 notes the behavior @bosilca mentioned with respect to MPI_File_get_info() in H5FD_mpio_open() (H5FDmpio.c:884):

Copy hints in info_used into info. Note hints in info_used supersede info. There may be some hints set and used by HDF5 only, but not recognizable by MPI-IO. We need to keep them, as MPI_File_get_info() will remove any hints unrecognized by MPI-IO library underneath.

fortnern avatar Sep 19 '24 20:09 fortnern

@qkoziol please keep us informed on the forum discussion, to have a basis on moving this forward one way or another.

bosilca avatar Sep 20 '24 18:09 bosilca

I've confirmed the following standalone test program fails on openmpi 5.0.5, on the line "Number of keys on double duplicated MPI info object does not match that on original". It requires the info to be duplicated twice to fail.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define ERROR(msg) \
do { \
    printf(msg "\n"); \
    exit(1); \
} while(0)

int main(void) {
    MPI_Info info1 = MPI_INFO_NULL, info2 = MPI_INFO_NULL, info3 = MPI_INFO_NULL;
    int nkeys1, nkeys2, nkeys3;

    if (MPI_Info_create(&info1) != MPI_SUCCESS)
        ERROR("MPI_Info_create failed");

    if (MPI_Info_set(info1, "custom_key", "custom_value") != MPI_SUCCESS)
        ERROR("MPI_Info_set failed");

    if (MPI_Info_get_nkeys(info1, &nkeys1) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info1, &nkeys1) failed");

    if (MPI_Info_dup(info1, &info2) != MPI_SUCCESS)
        ERROR("MPI_Info_dup failed");

    if (MPI_Info_free(&info1) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info1) failed");

    if (MPI_Info_get_nkeys(info2, &nkeys2) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info2, &nkeys2) failed");

    if (nkeys1 != nkeys2)
        ERROR("Number of keys on duplicated MPI info object does not match that on original");

    if (MPI_Info_dup(info2, &info3) != MPI_SUCCESS)
        ERROR("MPI_Info_dup failed");

    if (MPI_Info_free(&info2) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info2) failed");

    if (MPI_Info_get_nkeys(info3, &nkeys3) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info3, &nkeys3) failed");

    if (nkeys1 != nkeys3)
        ERROR("Number of keys on double duplicated MPI info object does not match that on original");

    if (MPI_Info_free(&info3) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info3) failed");

    printf("test passed\n");

    return 0;
}

fortnern avatar Oct 03 '24 22:10 fortnern

For the record: the test case provided by @fortnern is a different issue from the original test case (which relied on MPI returning unknown info keys on a communicator). #12847 provides a fix for this latest issue.

devreal avatar Oct 08 '24 13:10 devreal

For the record: the test case provided by @fortnern is a different issue from the original test case (which relied on MPI returning unknown info keys on a communicator). #12847 provides a fix for this latest issue.

By "original test case" do you mean the HDF5 test failure in this issue description? HDF5 never retrieves the info from a communicator, so I'm not sure how that is relevant. Also, @edgargabriel has confirmed #12847 fixes the HDF5 test failure.

fortnern avatar Oct 09 '24 16:10 fortnern