PeleLMeX

Ahead-of-time compilation with SYCL (SYCL_AOT) on Intel GPUs

Open ThomasHowarth opened this issue 10 months ago • 5 comments

I've started working on a pure Intel machine and have been trying to use the SYCL compile options. A few things I've noticed:

  1. Without AOT compilation and an explicitly specified architecture, the code runs significantly slower.
  2. If I use AOT compilation as specified by the AMReX documentation, SUNDIALS crashes at runtime. I'm not sure whether the CMake options currently specified pass any of the AOT options on to SUNDIALS.
  3. This is probably more of a direct AMReX issue, but turning on SYCL_AOT and specifying the architecture only works if I do it in the top GNUMakefile; it isn't picked up from the Make.machine file I've created for this specific machine, which seems like a more logical place to specify this.
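
For reference, a minimal sketch of the kind of settings point 3 describes, placed in the case's top GNUMakefile. `SYCL_AOT` is the option named above; `USE_SYCL` and `AMREX_INTEL_ARCH` are written from my recollection of the AMReX GNU Make options and should be checked against the AMReX documentation, and the value `pvc` is only a placeholder for whatever device the machine actually has.

```makefile
# Sketch only: variable names other than SYCL_AOT are assumptions to be
# verified against the AMReX build documentation.

# Build for Intel GPUs through SYCL.
USE_SYCL = TRUE

# Compile device code ahead of time instead of JIT-compiling at runtime.
SYCL_AOT = TRUE

# Target device for AOT compilation (placeholder value, machine specific).
AMREX_INTEL_ARCH = pvc
```

Comments are kept on their own lines because GNU Make would otherwise include the whitespace before a trailing `#` in the variable's value.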

ThomasHowarth avatar Jan 09 '25 12:01 ThomasHowarth

Hopefully I can comment on more of your points later, but I have thoughts on 1. Without AOT, are you sure it's not just the first kernel invocations that are slow? SYCL uses JIT, so the first one or two time steps should be slow without AOT since it's compiling the kernels at runtime, but then it should be fine. The problem I have seen is that with large chemistry mechanisms, compile time is prohibitive in general. Even using AOT, I've waited hours when compiling only to give up, not knowing when it would complete. However, I have not tried this within the last year, so I hope Intel's compilers have improved since then.

jrood-nrel avatar Jan 09 '25 15:01 jrood-nrel

Ah, I see! Alarm bells were ringing when it seemed very slow at the start, and I was being impatient. I tested with JIT, and past the first time step it was much faster. As for the compile time with AOT, it was pretty reasonable for drm19 (comparable with CUDA compilation that I've done on a different machine), so perhaps the compilers are better now. I haven't tested a larger mechanism, but I also didn't use the SYCL_PARALLEL_LINK_JOBS parameter.

Thanks for the help!
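
For anyone trying a larger mechanism, a sketch of how the SYCL_PARALLEL_LINK_JOBS parameter mentioned above might be set in the top GNUMakefile. As I understand it, the option controls the number of parallel jobs used in the device link step; the value 8 is an arbitrary placeholder, not a tested recommendation.

```makefile
# Sketch only: the value is a placeholder; tune it to the cores and memory
# available on the build node.

# Ahead-of-time compilation, as discussed earlier in this thread.
SYCL_AOT = TRUE

# Number of parallel jobs for the SYCL device link step.
SYCL_PARALLEL_LINK_JOBS = 8
```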

ThomasHowarth avatar Jan 09 '25 15:01 ThomasHowarth

@jrood-nrel Should I leave this issue open, in case you would like to pass on AOT flags to SUNDIALS, or should I close it?

ThomasHowarth avatar Jan 17 '25 13:01 ThomasHowarth

We should probably keep it open. We just don't have access to any Intel GPU machines at the moment so I can't test anything.

jrood-nrel avatar Jan 17 '25 15:01 jrood-nrel

@ThomasHowarth, when getting the code to work on Intel GPUs, did you see anything like what is described in #552? What machine and compiler versions were you using?

baperry2 avatar Aug 13 '25 16:08 baperry2