[draft][UR] Use urEnqueueNativeCommandExp for enqueue_custom_operation
Use the experimental UR entrypoint urEnqueueNativeCommandExp to implement AdaptiveCpp's AdaptiveCpp_enqueue_custom_operation.
https://github.com/AdaptiveCpp/AdaptiveCpp/blob/develop/doc/enqueue-custom-operation.md
Ping @illuhad. Should this be AdaptiveCpp_enqueue_custom_operation or ACPP_enqueue_custom_operation?
@hdelan Great to see! We've recently renamed extensions to AdaptiveCpp_*, so it is now AdaptiveCpp_enqueue_custom_operation.
https://github.com/AdaptiveCpp/AdaptiveCpp/pull/1477
Brief perf results on AMD MI210 courtesy of @hjabird :
For GROMACS on MI210:
| Version | ADH dodec | BenchMEM |
|---|---|---|
| Control | 84% | 96% |
| ACpp host task (this PR) | 90% | 100% |
| Original host task ext ({add/get}_native_events) | 90% | 99% |
All values are % of reference performance.
Ping @MartinWehking @aelovikov-intel @intel/llvm-reviewers-runtime @intel/llvm-reviewers-cuda @intel/sycl-graphs-reviewers
I need to get this merged before the end of the week, as I'll be on holiday next week before the GitHub cutoff.
Ping @intel/llvm-reviewers-runtime, it would be great to get a review on this ASAP, as we would like to merge today if the spec issues all get resolved.
I have been asked to merge this before the PI removal is merged. So I think this is good to go @intel/llvm-gatekeepers
@hdelan: I think this PR is still missing approval:
> Waiting on code owner review from intel/llvm-reviewers-runtime
Aha, thanks @sommerlukas, I missed that. Where does it say that?
In that case ping @intel/llvm-reviewers-runtime
Thanks @aelovikov-intel. Ping @intel/llvm-gatekeepers, this can be merged.