
[SYCL] Update time bases when submit time exceeds start time

Open againull opened this issue 7 months ago • 1 comments

Currently, for command buffers, we set the submission time equal to the start time if the recorded submit time is greater than the start time. This is needed because the submission time is based on an estimated clock, and over time the device and host clocks may drift apart, making that estimate inaccurate. I believe the same approach should be used for all tasks, not only for command buffers, because currently there is no guarantee that submit_time <= start_time given the potential clock drift, yet that condition must hold according to the SYCL specification. So, enable this approach for all tasks and use it as a trigger to query the synchronized clock from the backend, i.e. to correct the estimate.
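To make the proposal concrete, here is a minimal sketch of the clamp-and-resync idea. It is not the DPC++ runtime code: `TimeBase`, `estimate_device_time`, `clamp_submit_time`, and the `needs_resync` flag are hypothetical names chosen for illustration, and the actual backend query for a synchronized host/device timestamp pair is left out.

```cpp
#include <cstdint>

// Hypothetical host/device clock correlation captured at the last sync point.
// This is an illustration only, not the type used by the SYCL runtime.
struct TimeBase {
  std::uint64_t host_ns = 0;    // host timestamp taken at the last sync
  std::uint64_t device_ns = 0;  // device timestamp taken at the same moment
  bool needs_resync = false;    // set when drift is detected

  // Estimate the device time for a host timestamp taken at or after the
  // last sync. The estimate degrades as the two clocks drift apart.
  std::uint64_t estimate_device_time(std::uint64_t host_now_ns) const {
    return device_ns + (host_now_ns - host_ns);
  }
};

// Enforce submit_time <= start_time. A violation means the estimate has
// drifted, so the time base is flagged for a refresh from the backend.
inline std::uint64_t clamp_submit_time(std::uint64_t submit_ns,
                                       std::uint64_t start_ns,
                                       TimeBase &base) {
  if (submit_ns > start_ns) {
    base.needs_resync = true;  // trigger: re-query the synchronized clock
    return start_ns;           // report submit == start
  }
  return submit_ns;
}
```

In this sketch the caller is assumed to check `needs_resync` before the next submission and refresh `host_ns`/`device_ns` from the backend at that point.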

PS. The accuracy of the submission time calculation will be improved in https://github.com/intel/llvm/pull/18735, but that still doesn't guarantee the condition is satisfied, so we have to use the proposed approach.

againull avatar May 29 '25 00:05 againull

I've realized that there is a problem with this approach. To compare submit_time with start_time, we have to query start_time, which becomes available only once the command has started executing. So to get it I have to either wait in a loop for start_time to become available or synchronize on the event, which will block execution. I'm not sure whether waiting in a loop is acceptable, and blocking at the submit_time query seems undesirable. I am not yet sure what other options we have; if you have any ideas or input, please let me know.
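For reference, one conceivable non-blocking variant is sketched below: apply the correction only when start_time happens to be available already, and otherwise return the uncorrected estimate. This is purely an assumption for discussion, not something the runtime does; `ProfiledEvent`, `SubmitQuery`, and `query_submit_time` are hypothetical names, and the sketch requires C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical view of an event's profiling data (illustration only).
struct ProfiledEvent {
  std::uint64_t estimated_submit_ns = 0;
  // Empty until the command has actually started executing.
  std::optional<std::uint64_t> start_ns;
};

// Result of a submit-time query: the reported value and whether drift
// (estimated submit > start) was observed.
struct SubmitQuery {
  std::uint64_t submit_ns;
  bool drift_detected;
};

// Never waits on the event: the clamp is applied only if start_ns is
// already known, otherwise the estimate is returned unchanged.
inline SubmitQuery query_submit_time(const ProfiledEvent &ev) {
  if (ev.start_ns && ev.estimated_submit_ns > *ev.start_ns) {
    return {*ev.start_ns, true};
  }
  return {ev.estimated_submit_ns, false};
}
```

The obvious drawback is that the reported submit time can change between a query made before the command starts and one made after it, so this trades blocking for a value that is not stable across queries.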

againull avatar Jun 11 '25 05:06 againull

Sorry for the sidetrack, but we hit the same problem at Argonne in our tracing API (https://github.com/argonne-lcf/THAPI/). We worked with the L0 people. In theory, this extension should take care of drift (or at least it will be an L0 problem if they don't : ) )

https://oneapi-src.github.io/level-zero-spec/level-zero/1.11/core/EXT_EventQueryKernelTimestamps.html

TApplencourt avatar Jun 19 '25 22:06 TApplencourt