unified-runtime
unified-runtime copied to clipboard
Support UR program creation from multiple device binaries
To support multi-device AOT scenario in SYCL we need an ability to create UR program from multiple device binaries.
Changes in this PR:
- Modify the core function
urProgramCreateWithBinaryto support program creation from multiple device binaries. - Add methods to
ur_programto get/set per-device data like L0 module handle, build log etc. Otherwise any change in the structure of the class requires multiple changes in all UR functions which work withur_program, in addition to this it allows to handle interop case (which implementation is an exception right now) in a single place. Also make State and some other info per-device because it is possible that UR program is associated with multiple devices andurProgramBuildExpis called multiple times for subsets, and we have to know the state per-device.
Corresponding intel/llvm PR with E2E tests: https://github.com/intel/llvm/pull/15546 Also outlined minimal changes just to uplift UR tag: https://github.com/intel/llvm/pull/15637
There's an ongoing effort to remove this extension by making it the default behaviour for the affected entry points https://github.com/oneapi-src/unified-runtime/pull/1195 as it was a workaround for an issue discovered late in the last release cycle. Since it seems like there's a compelling use case I think this should be a change to urProgramCreateWithBinary, maybe with a device info flag for adapters report whether they support creating binaries for more than one device at a time
There's an ongoing effort to remove this extension by making it the default behaviour for the affected entry points #1195 as it was a workaround for an issue discovered late in the last release cycle. Since it seems like there's a compelling use case I think this should be a change to
urProgramCreateWithBinary, maybe with a device info flag for adapters report whether they support creating binaries for more than one device at a time
Thank you, I've changed the PR to modify the core urProgramCreateWithBinary instead.
@againull looks like there are conformance tests which still need updated. Some conformance tests are only enabled when the UR_DPCXX CMake variable points at clang++ from an intel/llvm build as they depend on generating program binaries.
@againull looks like there are conformance tests which still need updated. Some conformance tests are only enabled when the
UR_DPCXXCMake variable points atclang++from an intel/llvm build as they depend on generating program binaries.
Oh, I see, thanks, I will check and fix that.
@againull looks like there are conformance tests which still need updated. Some conformance tests are only enabled when the
UR_DPCXXCMake variable points atclang++from an intel/llvm build as they depend on generating program binaries.
Fixed remaining failures.
Sorry, I was formatting using .clang-format in unified-runtime and latest clang-format, it turns out cppformat target needs to be used together with clang-format-15 to get the correct formatting, fixed that.
Other than that only e2e tasks are failing because at least these changes https://github.com/intel/llvm/pull/15637 are required in intel/llvm to use new function signature.
Can I ask UR team to take a look at this PR please. Most of the changes are related to the L0 adapter, so it would be nice if @pbalcer or @nrspruit could provide a review.
So far, everything looks good to me, however, @againull can you create a devops branch to kick off testing with multi devices in intel/llvm? Since this is specifically for multi device, then running on PVC for the e2e tests would be ideal. By default I don't think the pre-commit runs on PVC or DG2 or anything other than GEN12 for the most part.
So far, everything looks good to me, however, @againull can you create a devops branch to kick off testing with multi devices in intel/llvm? Since this is specifically for multi device, then running on PVC for the e2e tests would be ideal. By default I don't think the pre-commit runs on PVC or DG2 or anything other than GEN12 for the most part.
Ok, sure, I will invoke such testing.
By default I don't think the pre-commit runs on PVC or DG2 or anything other than GEN12 for the most part.
just fyi - we have a multi-device PVC system in UR CI. You just need to add tests, which this doesn't.
By default I don't think the pre-commit runs on PVC or DG2 or anything other than GEN12 for the most part.
just fyi - we have a multi-device PVC system in UR CI. You just need to add tests, which this doesn't.
@nrspruit @pbalcer I have enabled existing "program" conformance test (https://github.com/oneapi-src/unified-runtime/pull/2147/commits/f3fa464c8e7ff3f873251e30a2e08c9f84edafe0), as well as added new multi-device test (https://github.com/oneapi-src/unified-runtime/pull/2147/commits/88fb72aed96f657fba6dff1982ee2070ca720590) to verify urProgramCreateWithBinary.
@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-native-cpu-write @oneapi-src/unified-runtime-command-buffer-write Could you please take a look.
@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-native-cpu-write @oneapi-src/unified-runtime-command-buffer-write Could you please take a look.
I can't see any files owned by @oneapi-src/unified-runtime-command-buffer-write in this PR diff, I assume the change has been refactored in some way to avoid touching these files since this reviewer group was requested.
@oneapi-src/unified-runtime-maintain Could you please merge this PR.