[WIP][SYCL][HostTask] Optimize blocked users tracking
This commit partially addresses a performance issue observed when submitting consecutive host tasks to an in-order queue without an explicit wait(). The execution time of each host task was found to increase significantly as the number of submissions grew:
https://github.com/intel/llvm/issues/18500.
The major cause was identified as the unnecessary tracking of indirect blocking dependencies in MBlockedUsers. Previously, all direct and indirect blocking relations between enqueued commands were tracked, causing a significant increase in notification time upon task completion. For example, in a sequence of tasks A, B, C, D, A.MBlockedUsers would redundantly include {C, D}, even though these tasks are already blocked by B.
To resolve this, the enqueueCommand function was extended with a TrackBlockedUser flag that is used during recursive enqueueing. This change prevents excessive growth of Cmd->MBlockedUsers in long dependency chains by tracking only the host task's immediate dominator in the dependency tree, thereby reducing notification time upon command completion.
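To illustrate the growth described above, here is a minimal toy model, not the actual scheduler code. `ToyCommand`, `registerBlockedUser`, `totalTrackedEntries`, and the `TrackIndirect` parameter are hypothetical names invented for this sketch, and it assumes a purely linear chain in which no command has completed yet. It only counts how many blocked-user entries accumulate under each tracking policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a scheduler command (hypothetical, not the SYCL runtime class).
struct ToyCommand {
  ToyCommand *Dep = nullptr;              // the single command this one waits on
  std::vector<ToyCommand *> BlockedUsers; // commands to notify on completion
};

// Record User in the commands that block it.
// TrackIndirect == true: every ancestor in the chain records User
// (direct and indirect relations, as in the behavior described above).
// TrackIndirect == false: only the direct dependency records User.
void registerBlockedUser(ToyCommand &User, bool TrackIndirect) {
  for (ToyCommand *Blocker = User.Dep; Blocker; Blocker = Blocker->Dep) {
    Blocker->BlockedUsers.push_back(&User);
    if (!TrackIndirect)
      break; // stop after the immediate blocker
  }
}

// Build a linear chain of N commands (each depends on the previous one)
// and return the total number of blocked-user entries across all commands.
std::size_t totalTrackedEntries(std::size_t N, bool TrackIndirect) {
  assert(N > 0 && "chain must contain at least one command");
  std::vector<ToyCommand> Chain(N); // sized up front, so pointers stay valid
  for (std::size_t I = 1; I < N; ++I) {
    Chain[I].Dep = &Chain[I - 1];
    registerBlockedUser(Chain[I], TrackIndirect);
  }
  std::size_t Total = 0;
  for (const ToyCommand &Cmd : Chain)
    Total += Cmd.BlockedUsers.size();
  return Total;
}
```

For the four-task chain A, B, C, D, `totalTrackedEntries(4, true)` yields 6 entries (A holds B, C, D; B holds C, D; C holds D), while `totalTrackedEntries(4, false)` yields 3. In this model the totals are N(N-1)/2 versus N-1 for a chain of N commands, which is consistent with notification cost growing as more host tasks are submitted.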