
comm: reimplement nonblocking contextid allocation using MPIX Async

Open • hzhou opened this issue 4 months ago • 1 comment

Pull Request Description

The nonblocking contextid allocation algorithm is currently implemented using Sched. It requires a few hacks and is very difficult to debug. Re-implement it using the MPIX Async API instead.

NOTE: Hopefully, this will resolve the outstanding test xfails. Now that I understand the algorithm better, if we still encounter lock contention issues, we can try inserting a heavy yield when we know we are not getting the masks.
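
For readers unfamiliar with the poll-driven style, here is a rough, single-process sketch of the idea in the note above: a poll callback that reports "no progress" (and counts a yield) while the mask is owned by someone else, and completes once it can take an id. This is not the code in this PR and it does not use the real MPIX Async API. `poll_result`, `ctx_pool`, `alloc_req`, `alloc_poll`, and `drive` are all hypothetical names, and the cross-rank agreement step of the real algorithm is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical return codes for a poll callback (not MPICH's actual names). */
enum poll_result { POLL_NOPROGRESS, POLL_DONE };

/* Hypothetical shared state: one bit per free context id, plus a flag
 * standing in for "someone else currently owns the mask". */
struct ctx_pool {
    uint32_t free_mask;   /* bit i set => id i is available */
    bool mask_in_use;     /* true while another allocation owns the mask */
};

/* Per-request state carried between poll invocations. */
struct alloc_req {
    struct ctx_pool *pool;
    int ctx_id;           /* result; stays -1 until an id is allocated */
    int yields;           /* number of polls that found the mask busy */
};

/* One poll step: back off if the mask is busy, otherwise take the
 * lowest free id. A real implementation would yield here instead of
 * only counting. */
enum poll_result alloc_poll(struct alloc_req *req)
{
    struct ctx_pool *pool = req->pool;

    if (pool->mask_in_use) {
        req->yields++;
        return POLL_NOPROGRESS;
    }

    assert(pool->free_mask != 0);   /* sketch: exhaustion is not handled */
    for (int i = 0; i < 32; i++) {
        uint32_t bit = UINT32_C(1) << i;
        if (pool->free_mask & bit) {
            pool->free_mask &= ~bit;
            req->ctx_id = i;
            break;
        }
    }
    return POLL_DONE;
}

/* Toy progress loop: poll until done or until max_polls is reached.
 * Returns the number of polls performed. */
int drive(struct alloc_req *req, int max_polls)
{
    int polls = 0;
    while (polls < max_polls) {
        polls++;
        if (alloc_poll(req) == POLL_DONE)
            break;
    }
    return polls;
}
```

The point of the sketch is only that each request is a small state machine whose state is visible between polls, which is what makes this style easier to debug than a schedule of opaque entries.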

[skip warnings]

Author Checklist

  • [x] Provide Description: Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • [x] Commits Follow Good Practice: Commits are self-contained and do not do two things at once. Commit message is of the form: module: short description. Commit message explains what's in the commit.
  • [ ] Passes All Tests: Whitespace checker. Warnings test. Additional tests via comments.
  • [x] Contribution Agreement: For non-Argonne authors, check contribution agreement. If necessary, request an explicit comment from your company's PR approval manager.

hzhou, Oct 28 '25 23:10

test:mpich/ch3/most test:mpich/ch4/most

✔️

hzhou, Nov 03 '25 16:11