unified-runtime

Consider adding forward progress device info enums

Open aarongreig opened this issue 1 year ago • 6 comments

A set of new device info enums was added to PI recently to support the forward progress extension. These do not need adapter implementations, but they will be removed as part of the PI -> UR port work, so we need to consider whether they're worth adding to UR now for a future adapter implementation (as suggested by this comment).

aarongreig avatar May 20 '24 16:05 aarongreig

@steffenlarsen @lbushi25 I'd be interested in your informed opinions about this (I don't know anything about the extension in question)

aarongreig avatar May 20 '24 16:05 aarongreig

Hi @aarongreig, I do think we should have UR equivalents for these enums for future work. Tagging @steffenlarsen for more input in case he missed the first ping.

lbushi25 avatar May 21 '24 20:05 lbushi25

Eventually the forward progress query logic should be moved to the backends. I am open to alternatives to the design, but I am definitely in favor of adding it to UR.

steffenlarsen avatar May 22 '24 12:05 steffenlarsen

Call 2024-05-22:

  • In the immediate term these would ideally be added to UR to allow PI to be replaced, but this is not strictly required because the values are already hard-coded in SYCL-RT.
  • In the medium term the expectation is that the SYCL extension would allow different devices to report different forward progress guarantees.
  • Can the OpenCL adapter query the OpenCL device for this? Only for sub-groups, not for work-groups or work-items. The current SYCL-RT hardcodes the expected value for OpenCL CPUs.
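
As a rough illustration of the last point, here is a minimal sketch of a "query the device where possible, otherwise fall back to a hard-coded value" pattern. All type and function names (`Scope`, `Guarantee`, `queryDevice`, `resolve`) are hypothetical and are not the real UR or PI identifiers. The mapping from a boolean sub-group answer onto a guarantee level is also an assumption made for the example.

```cpp
#include <optional>

// All names below are hypothetical; they are not the real UR or PI identifiers.
enum class Scope { WorkItem, SubGroup, WorkGroup };
enum class Guarantee { WeaklyParallel, Parallel, Concurrent };

// Stand-in for what an OpenCL adapter could learn from the device. Per the
// notes above, only the sub-group scope can be answered; in OpenCL 2.1+ that
// answer would come from CL_DEVICE_SUB_GROUP_INDEPENDENT_FORWARD_PROGRESS.
// Mapping the boolean onto Concurrent/WeaklyParallel is an assumption.
constexpr std::optional<Guarantee> queryDevice(Scope scope,
                                               bool independentSubGroups) {
  if (scope != Scope::SubGroup)
    return std::nullopt; // no device query exists for this scope
  return independentSubGroups ? Guarantee::Concurrent
                              : Guarantee::WeaklyParallel;
}

// Use the device's answer when there is one, otherwise the hard-coded default.
constexpr Guarantee resolve(Scope scope, bool independentSubGroups,
                            Guarantee hardcodedDefault) {
  return queryDevice(scope, independentSubGroups).value_or(hardcodedDefault);
}
```

With this shape, work-group and work-item queries keep returning whatever the runtime hard-codes today, while sub-group queries can start reflecting the device's own answer.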

alycm avatar May 22 '24 14:05 alycm

In case nobody has started working on this, I'm thinking of getting started on this task of moving the forward progress query logic to the UR layer. That said, I do intend to keep hardcoding the logic in these layers based on what we expect to see from certain devices, as outlined at the very end of the relevant extension doc here. This hardcoded logic is intended to be a placeholder until people more familiar with the respective APIs, such as OpenCL, L0, and CUDA, are able to query the device itself for this info. Furthermore, it will simplify testing from the SYCL side. Therefore, can someone internal assign this to me?
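
To make the placeholder idea concrete, here is a minimal sketch of a hard-coded lookup keyed on backend, device type, and scope. Every name in it (`Backend`, `DeviceType`, `Entry`, `kPlaceholders`, `placeholderGuarantee`) is hypothetical, and the table values are illustrative only; they are not copied from the extension document or from the actual adapters.

```cpp
// Hypothetical names throughout; these are not the actual UR enums.
enum class Backend { OpenCL, LevelZero, CUDA };
enum class DeviceType { CPU, GPU };
enum class Scope { WorkItem, SubGroup, WorkGroup };
enum class Guarantee { WeaklyParallel, Parallel, Concurrent };

struct Entry {
  Backend backend;
  DeviceType type;
  Scope scope;
  Guarantee guarantee;
};

// Illustrative values only. A real table would follow the expectations
// listed in the extension document.
constexpr Entry kPlaceholders[] = {
    {Backend::OpenCL, DeviceType::CPU, Scope::WorkGroup, Guarantee::Parallel},
    {Backend::LevelZero, DeviceType::GPU, Scope::SubGroup, Guarantee::Parallel},
    {Backend::CUDA, DeviceType::GPU, Scope::WorkGroup, Guarantee::Concurrent},
};

// Return the hard-coded guarantee for a combination, or the weakest
// guarantee when the combination is not listed.
constexpr Guarantee placeholderGuarantee(Backend b, DeviceType t, Scope s) {
  for (const Entry &e : kPlaceholders)
    if (e.backend == b && e.type == t && e.scope == s)
      return e.guarantee;
  return Guarantee::WeaklyParallel;
}
```

Keeping the values in one table per layer would make it easy to swap individual entries for real device queries later, and gives the SYCL-side tests a fixed set of answers to check against.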

lbushi25 avatar May 23 '24 19:05 lbushi25

no one's started on it to my knowledge so I've assigned you the ticket, thanks for taking it on

aarongreig avatar May 24 '24 08:05 aarongreig