[SYCL][Graph] Leak in host task in order dependency test on Windows Gen12

Open mmichel11 opened this issue 1 month ago • 2 comments

Describe the bug

First identified in: https://github.com/intel/llvm/actions/runs/19515665783/job/55869141453?pr=20690.

The newly added test host_task_in_order_dependency.cpp leaks Level Zero (L0) objects. So far the issue has only shown up on Gen12 in testing, and the test is being marked as XFAIL.

Note that I reproduced the leak with icx 2025.2.0, so the issue predates the PR which added the test.

To reproduce

Compile and run the test host_task_in_order_dependency.cpp manually, or run it with llvm-lit.

Manual run on Windows Gen12:

C:\...>.\in_order_host_task_dependency.exe
Check balance of create/destroy calls
----------------------------------------------------------
               zeContextCreate = 1     \--->              zeContextDestroy = 0     ---> LEAK = 1
          zeCommandQueueCreate = 1     \--->         zeCommandQueueDestroy = 1
                zeModuleCreate = 1     \--->               zeModuleDestroy = 0     ---> LEAK = 1
                zeKernelCreate = 1     \--->               zeKernelDestroy = 0     ---> LEAK = 1
             zeEventPoolCreate = 1     \--->            zeEventPoolDestroy = 0     ---> LEAK = 1
  zeCommandListCreateImmediate = 1     |
           zeCommandListCreate = 1     \--->          zeCommandListDestroy = 0     ---> LEAK = 2
                 zeEventCreate = 1     \--->                zeEventDestroy = 1
                 zeFenceCreate = 1     \--->                zeFenceDestroy = 1
                 zeImageCreate = 0     |
          zeImageViewCreateExt = 0     \--->                zeImageDestroy = 0
               zeSamplerCreate = 0     \--->              zeSamplerDestroy = 0
              zeMemAllocDevice = 2     |
                zeMemAllocHost = 3     |
              zeMemAllocShared = 4     \--->                     zeMemFree = 8     ---> LEAK = 1
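
For anyone reading the report by hand: as far as I can tell, each LEAK value is the sum of the create counts in a group (lines ending in `|` carry over into the next `\--->` line) minus the matching destroy count. Below is a minimal Python sketch of that arithmetic. The function name `find_leaks` and the regular expressions are my own and only assume the raw report layout shown above; this is not the adapter's implementation.

```python
import re

# Assumed layout: "<createFn> = <n>" followed by either "|" (group continues)
# or "\---> <destroyFn> = <m>" (group ends).
CREATE_RE = re.compile(r"^\s*(\w+)\s*=\s*(\d+)")
DESTROY_RE = re.compile(r"\\--->\s*(\w+)\s*=\s*(\d+)")


def find_leaks(report: str) -> dict[str, int]:
    """Return {destroy_function: creates - destroys} for unbalanced groups."""
    leaks = {}
    pending_creates = 0
    for line in report.splitlines():
        create = CREATE_RE.match(line)
        if create is None:
            continue  # header, separator, or blank line
        pending_creates += int(create.group(2))
        destroy = DESTROY_RE.search(line)
        if destroy is None:
            continue  # "|" line: keep accumulating creates for this group
        diff = pending_creates - int(destroy.group(2))
        if diff != 0:
            leaks[destroy.group(1)] = diff
        pending_creates = 0
    return leaks


# A few lines copied from the Windows run above.
SAMPLE = r"""
  zeCommandListCreateImmediate = 1     |
           zeCommandListCreate = 1     \--->          zeCommandListDestroy = 0     ---> LEAK = 2
              zeMemAllocDevice = 2     |
                zeMemAllocHost = 3     |
              zeMemAllocShared = 4     \--->                     zeMemFree = 8     ---> LEAK = 1
                 zeEventCreate = 1     \--->                zeEventDestroy = 1
"""

print(find_leaks(SAMPLE))  # {'zeCommandListDestroy': 2, 'zeMemFree': 1}
```

Note that the sketch expects the raw report; the CI log format with the `# |` prefix would need that prefix stripped first.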

Passing run on Linux Gen12:

$ UR_L0_LEAKS_DEBUG=1 ./in_order_host_dependency
Check balance of create/destroy calls
----------------------------------------------------------
               zeContextCreate = 1     \--->              zeContextDestroy = 1
          zeCommandQueueCreate = 1     \--->         zeCommandQueueDestroy = 1
                zeModuleCreate = 1     \--->               zeModuleDestroy = 1
                zeKernelCreate = 1     \--->               zeKernelDestroy = 1
             zeEventPoolCreate = 1     \--->            zeEventPoolDestroy = 1
  zeCommandListCreateImmediate = 1     |
           zeCommandListCreate = 1     \--->          zeCommandListDestroy = 2
                 zeEventCreate = 1     \--->                zeEventDestroy = 1
                 zeFenceCreate = 1     \--->                zeFenceDestroy = 1
                 zeImageCreate = 0     |
          zeImageViewCreateExt = 0     \--->                zeImageDestroy = 0
               zeSamplerCreate = 0     \--->              zeSamplerDestroy = 0
              zeMemAllocDevice = 2     |
                zeMemAllocHost = 3     |
              zeMemAllocShared = 4     \--->                     zeMemFree = 9

Environment

  • Windows
  • Intel GPU Gen 12
  • Reproduced the issue with compilers as early as icpx 2025.2
  • Driver version 1.6.35096. Use the Level Zero backend and set UR_L0_LEAKS_DEBUG=1 to check for leaks
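
A rough sketch of how the run could be scripted with that environment, in Python so it works the same on Windows and Linux. `UR_L0_LEAKS_DEBUG=1` comes from this report; using `ONEAPI_DEVICE_SELECTOR=level_zero:gpu` to force the Level Zero backend is my assumption about the setup, and the helper names and binary path are hypothetical.

```python
import os
import subprocess


def leak_check_env(base=None):
    """Copy an environment and enable Level Zero leak reporting in it."""
    env = dict(os.environ if base is None else base)
    env["UR_L0_LEAKS_DEBUG"] = "1"  # from the issue: prints the create/destroy balance
    env["ONEAPI_DEVICE_SELECTOR"] = "level_zero:gpu"  # assumption: pin the L0 backend
    return env


def run_with_leak_check(binary):
    """Run the test binary (hypothetical path) and return the completed process."""
    return subprocess.run(
        [binary], env=leak_check_env(), capture_output=True, text=True
    )
```

Calling `run_with_leak_check` with the path to the compiled test should give the balance report in the captured output, assuming the binary was built against a runtime that honors these variables.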

Additional context

No response

mmichel11, Nov 20 '25 03:11

The leak was fixed after syncing with the main sycl branch, and the XFAIL was removed. The failure with 2025.2 is also expected behavior unless SYCL_ENABLE_DEFAULT_CONTEXTS=0 is set; with that setting, the test passes.

Edit: see below.

mmichel11, Dec 01 '25 22:12

After passing in several CI runs, a newer run shows the failure reappearing, which suggests the test is flaky:

  # |           1: Check balance of create/destroy calls 
  # |           2: ---------------------------------------------------------- 
  # |           3:  zeContextCreate = 1 \---> zeContextDestroy = 0 ---> LEAK = 1 
  # | not:imp1                                                          !~~~      error: no match expected
  # |           4:  zeCommandQueueCreate = 1 \---> zeCommandQueueDestroy = 0 ---> LEAK = 1 
  # |           5:  zeModuleCreate = 1 \---> zeModuleDestroy = 1  
  # |           6:  zeKernelCreate = 1 \---> zeKernelDestroy = 1  
  # |           7:  zeEventPoolCreate = 2 \---> zeEventPoolDestroy = 0 ---> LEAK = 2 
  # |           8:  zeCommandListCreateImmediate = 1 |

The reported leaks also seem to align with https://github.com/intel/llvm/issues/14473. I propose marking the test as UNSUPPORTED for now: https://github.com/intel/llvm/pull/20792

mmichel11, Dec 02 '25 04:12