Level Zero Sysman Backport for hwloc v2.11

Open servesh opened this issue 10 months ago • 6 comments

Hi, it would be useful to backport the Level Zero fixes related to Sysman to the v2.11 branch, since MPICH releases depend on this version.

I did my best to backport the patches from master to v2.11 branch and tested it on Aurora. Let me know if a PR on this is helpful or if there are other plans to backport these fixes.

https://github.com/servesh/hwloc/tree/v2.11-level-zero-fix

servesh avatar Jan 30 '25 19:01 servesh

Hello. Do you need a proper release, or do you just need a v2.11 branch with those fixes? I am not planning to backport these intrusive changes to an official 2.11.3 but rather release a 2.12 (multiple hurdles have been delaying this release for dumb reasons but hopefully in a couple weeks).

bgoglin avatar Jan 30 '25 20:01 bgoglin

@bgoglin It's fine if you can include them in the v2.11 branch. We can adopt 2.12 whenever it's available.

servesh avatar Feb 03 '25 21:02 servesh

I pushed a v2.11-mpich branch with 2.11.2 + all backported fixes pending in branch v2.11 + levelzero backports from v2.12. Can you test it? I am preparing v2.12; rc1 will likely be out tomorrow. Let me know when you start/stop using the v2.11-mpich branch so that I know whether to keep or destroy it.

bgoglin avatar Feb 10 '25 16:02 bgoglin

@bgoglin Thanks. It will take me a few weeks to integrate this into our build/testing at scale, so it will take some time to report back. I will continue to use the v2.11-mpich branch until I can confirm that v2.12 works fine with MPICH.

servesh avatar Feb 11 '25 16:02 servesh

Is it OK to rebase this branch when I backport some changes to the official 2.11? Or would you rather get merges and fast-forward pulls?

bgoglin avatar Feb 12 '25 14:02 bgoglin

@bgoglin Rebase or whichever is easier is fine. All I'm looking for is a branch/release that has the Level Zero fixes in hwloc and works with MPICH. If hwloc v2.12 works with MPICH, then I wouldn't have to trouble you with maintaining this branch.

servesh avatar Feb 13 '25 17:02 servesh