
[WIP][SYCL][HostTask] Optimize blocked users tracking

Open Nuullll opened this issue 5 months ago • 2 comments

This commit partially addresses a performance issue observed when submitting consecutive host tasks to an in-order queue without explicit wait(). The execution time of each host task was found to increase significantly as the number of submissions grew: https://github.com/intel/llvm/issues/18500.

The major cause was identified as the unnecessary tracking of indirect blocking dependencies in MBlockedUsers. Previously, all direct and indirect blocking relations between enqueued commands were tracked, causing a significant increase in notification time upon task completion. For example, in a sequence of tasks A, B, C, D, A.MBlockedUsers would redundantly include {C, D}, even though C and D are already blocked through B.

To resolve this, the enqueueCommand function now takes a TrackBlockedUser flag during recursive enqueueing. This prevents excessive growth of Cmd->MBlockedUsers in long dependency chains by tracking only the host task's immediate dominator in the dependency tree, thereby reducing notification time upon command completion.
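The effect on list sizes can be illustrated with a small standalone model. This is a sketch only, not the runtime's code: the `Command` struct, the `Blocker` field, and the `registerBlockers` and `blockedUserCounts` helpers are hypothetical names made up for illustration, and only `MBlockedUsers` and `TrackBlockedUser` are taken from the description above. The model assumes a linear chain in which every blocker is still incomplete when the next command is enqueued.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified command node; not the runtime's Command class.
struct Command {
  Command *Blocker = nullptr;           // direct blocking dependency, if any
  std::vector<Command *> MBlockedUsers; // commands notified on completion
};

// Walks up the blocker chain from Dep on behalf of Root. A blocker records
// Root only while TrackBlockedUser is true. With DirectOnly set, the flag is
// cleared after the nearest blocker, so indirect blockers skip Root.
void registerBlockers(Command *Dep, Command *Root, bool TrackBlockedUser,
                      bool DirectOnly) {
  if (Dep == nullptr)
    return;
  if (TrackBlockedUser)
    Dep->MBlockedUsers.push_back(Root);
  registerBlockers(Dep->Blocker, Root, TrackBlockedUser && !DirectOnly,
                   DirectOnly);
}

// Builds a chain of N commands, each blocked by its predecessor (assumed
// still incomplete), and returns the size of every MBlockedUsers list.
std::vector<std::size_t> blockedUserCounts(std::size_t N, bool DirectOnly) {
  std::vector<Command> Cmds(N); // sized once, so element addresses stay valid
  for (std::size_t I = 1; I < N; ++I) {
    Cmds[I].Blocker = &Cmds[I - 1];
    registerBlockers(Cmds[I].Blocker, &Cmds[I], /*TrackBlockedUser=*/true,
                     DirectOnly);
  }
  std::vector<std::size_t> Counts;
  for (const Command &Cmd : Cmds)
    Counts.push_back(Cmd.MBlockedUsers.size());
  return Counts;
}
```

For the four-task chain A, B, C, D, `blockedUserCounts(4, false)` gives list sizes 3, 2, 1, 0 in this model, while `blockedUserCounts(4, true)` gives 1, 1, 1, 0. Over a chain of N commands the total number of tracked entries is N(N-1)/2 in the first mode and N-1 in the second, which is consistent with the per-task slowdown growing with the number of submissions.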

Nuullll, May 16 '25 05:05