hydra: using HASH_REPLACE_STR instead of HASH_ADD_STR

Open Jayyee-HPC opened this issue 4 years ago • 4 comments

Pull Request Description

Currently the PMI server adds KVS entries to the hash table without checking whether the same key already exists. This may cause undefined behavior, since uthash does not allow duplicate keys.

A reliable way to reproduce the related bugs:

    char kvsname[KVSNAMELEN];
    char kvs_key[MAXKEYLEN];
    char value[MAXVALLEN];
    char get_value[MAXVALLEN];
    uint64_t key = 1001;
    uint64_t val = key;

    PMI_KVS_Get_my_name(kvsname, KVSNAMELEN);
    for (int i = 0; i < 101; ++i)
    {
        val++;
        /* The key is identical in every iteration; only the value changes. */
        snprintf(kvs_key, sizeof(kvs_key), "%.16" PRIx64, key);
        snprintf(value, sizeof(value), "%.16" PRIx64, val);
        PMI_KVS_Put(kvsname, kvs_key, value);
        PMI_Barrier();
        /* Read the key back; the result should match the value just put. */
        PMI_KVS_Get(kvsname, kvs_key, get_value, MAXVALLEN);
        printf("P%d itr %d put %s get %s\n", comm_ptr->rank, i, value, get_value);
    }

This test case exposes two issues:

  1. The same key can be put repeatedly with different values without any warning.
  2. With 2 processes, the values put in the 10th and 100th iterations are lost.
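The difference between "add" and "replace" semantics can be sketched with a toy key-value store in plain C. This is only an illustration under stated assumptions: it uses a linked list rather than uthash, and every name in it (`toy_kvs_entry`, `toy_kvs_find`, `toy_kvs_put_replace`, `toy_kvs_free`, the `TOY_*` limits) is hypothetical and not part of Hydra or PMI. The point it shows is that a put with replace semantics looks the key up first and overwrites the existing entry, so the table never holds two entries with the same key.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_MAXKEYLEN 64   /* hypothetical limits, not the PMI ones */
#define TOY_MAXVALLEN 64

/* Hypothetical toy entry; not Hydra's actual KVS structure. */
struct toy_kvs_entry {
    char key[TOY_MAXKEYLEN];
    char val[TOY_MAXVALLEN];
    struct toy_kvs_entry *next;
};

/* Return the entry stored under key, or NULL if there is none. */
struct toy_kvs_entry *toy_kvs_find(struct toy_kvs_entry *head, const char *key)
{
    for (struct toy_kvs_entry *e = head; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0)
            return e;
    }
    return NULL;
}

/* Put with replace semantics: overwrite the value if the key exists,
 * otherwise prepend a new entry.
 * Returns 1 if an existing value was replaced, 0 if a new entry was
 * added, and -1 on error (string too long or allocation failure). */
int toy_kvs_put_replace(struct toy_kvs_entry **head, const char *key, const char *val)
{
    if (strlen(key) >= TOY_MAXKEYLEN || strlen(val) >= TOY_MAXVALLEN)
        return -1;

    struct toy_kvs_entry *e = toy_kvs_find(*head, key);
    if (e != NULL) {
        strcpy(e->val, val);    /* length checked above */
        return 1;
    }

    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    strcpy(e->key, key);
    strcpy(e->val, val);
    e->next = *head;
    *head = e;
    return 0;
}

/* Release every entry in the list. */
void toy_kvs_free(struct toy_kvs_entry *head)
{
    while (head != NULL) {
        struct toy_kvs_entry *next = head->next;
        free(head);
        head = next;
    }
}
```

Calling `toy_kvs_put_replace` 101 times with the same key, as the reproducer does, leaves exactly one entry holding the last value. The return value of 1 also gives the caller a place to log or reject a duplicate key if replacing is not the desired policy.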

Author Checklist

  • [x] Provide Description: Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • [x] Commits Follow Good Practice: Commits are self-contained and do not do two things at once. Commit message is of the form: module: short description. Commit message explains what's in the commit.
  • [ ] Passes All Tests: Whitespace checker. Warnings test. Additional tests via comments.
  • [x] Contribution Agreement: For non-Argonne authors, check contribution agreement. If necessary, request an explicit comment from your company's PR approval manager.

Jayyee-HPC, Jun 15 '21 15:06

Whitespace checker is for each commit. You need to squash the commits in order to pass.

hzhou, Jun 15 '21 16:06

> Whitespace checker is for each commit. You need to squash the commits in order to pass.

Thanks. The Whitespace checker is so frustrating.

Jayyee-HPC, Jun 15 '21 16:06

What is the use-case for replacing values in the KVS? FYI, this type of behavior is not allowed by PMI1 and PMI2, though as you found, Hydra does not explicitly check for duplicate keys. It would be good to understand what you are trying to enable.

raffenet, Jun 15 '21 17:06

> What is the use-case for replacing values in the KVS? FYI, this type of behavior is not allowed by PMI1 and PMI2, though as you found, Hydra does not explicitly check for duplicate keys. It would be good to understand what you are trying to enable.

It's for a heartbeat implementation. The heartbeat needs to update its status.

hzhou, Jun 15 '21 17:06

Picked in PR #6564

hzhou, Jun 21 '23 17:06