MPI.jl

oneAPI-aware MPI

Open michel2323 opened this issue 4 months ago • 4 comments

I can only test this on Aurora at ANL. I could provide an MPICH_jll with oneAPI support. How should I proceed?

michel2323 avatar Sep 09 '25 15:09 michel2323

I have no idea what is failing here. Anyhow, before I add MPI.has_oneapi(), I want to leave all this to the user. It works on Aurora, so there's that. Can we merge this?

michel2323 avatar Dec 03 '25 19:12 michel2323

The failure looks unrelated. Is it possible for you to test your changes by adding a Buildkite pipeline? For an example, see how oneAPI.jl runs its tests: https://github.com/JuliaGPU/oneAPI.jl/blob/a00fad6d0532ab7548f236b9293e9dac5845fd0e/.buildkite/pipeline.yml

lcw avatar Dec 03 '25 19:12 lcw

I've now tried building MPICH with ze (Level Zero) support, and I'm running into all sorts of issues that I shouldn't have to deal with.

I've added a has_oneapi() method, which returns true only if the environment variable is explicitly set. At this point, I'd expect users to opt in explicitly, knowing they are in for a ride. I'll discuss an MPICH build with the MPICH team here at ANL.
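
For readers following along, here is a minimal sketch of what an opt-in check of this kind could look like. It is not the code from the PR: the variable name `JULIA_MPI_HAS_ONEAPI`, the function name `has_oneapi_sketch`, and the set of accepted values are all assumptions made for illustration, modeled loosely on how similar opt-in flags tend to be named.

```julia
# Hypothetical environment variable name; the thread does not state the real one.
const ONEAPI_ENV_VAR = "JULIA_MPI_HAS_ONEAPI"

# Return true only when the user has explicitly opted in through the environment.
# `env` defaults to the process environment but accepts any dictionary,
# which makes the check easy to exercise without touching ENV.
function has_oneapi_sketch(env::AbstractDict = ENV)
    flag = get(env, ONEAPI_ENV_VAR, nothing)
    # Variable not set: no auto-detection is attempted, so report false.
    flag === nothing && return false
    # Accepted spellings are an assumption of this sketch.
    return lowercase(strip(flag)) in ("true", "1")
end
```

With this shape, an unset variable always yields `false`, so nothing changes for users who never ask for oneAPI-aware MPI, and the decision stays entirely with the person launching the job.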

michel2323 avatar Dec 04 '25 17:12 michel2323

Sounds good! Let us know what the MPICH team says. I am happy to merge if building a oneAPI-aware MPICH is too much trouble.

lcw avatar Dec 04 '25 18:12 lcw