[CI] GPU type not passed to lit

Open sarnex opened this issue 2 years ago • 6 comments
We have lit variables like gpu-intel-gen12, but our CI never sets them, so they have no effect.

I tried to disable a test on gen12, but because the variable is never set, the test still runs on gen12.

  # Run E2E tests.
  export LIT_OPTS="-v --no-progress-bar --show-unsupported --max-time 3600 --time-tests"
  cmake --build build-e2e --target check-sycl-e2e

sarnex avatar Apr 27 '23 15:04 sarnex

@intel/dpcpp-devops-reviewers FYI

sarnex avatar Apr 27 '23 15:04 sarnex

Our current plan is to have lit automatically detect device aspects and set those as features. Once done, we can change many UNSUPPORTED: gpu-intel-pvc to REQUIRES: aspect-image. However, it seems that the case you're interested in won't be solved by this.
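A minimal sketch of the aspect-to-feature idea, for illustration only. The helper name `aspects_to_features` and the sample aspect list are hypothetical; how the aspects are actually queried from the device is not shown and is left as an assumption.

```python
def aspects_to_features(aspects):
    """Map device aspect names to lit feature names such as 'aspect-image'."""
    # Skip empty entries and normalize surrounding whitespace.
    return {"aspect-" + name.strip() for name in aspects if name.strip()}


# Hypothetical aspect list; a real lit config would obtain it from a device query.
detected_aspects = ["gpu", "image", "fp64"]
features = aspects_to_features(detected_aspects)

# In a lit config, these names would be added to config.available_features,
# so that a test can say REQUIRES: aspect-image.
print(sorted(features))
```

With features named this way, a test can state what it needs (REQUIRES: aspect-image) instead of listing the devices that lack it.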

We also had a brief discussion on whether we want similar dynamic detection of gpu-intel-pvc and the like, but I don't quite remember the conclusion.

aelovikov-intel avatar Apr 27 '23 15:04 aelovikov-intel

I think we will probably need some way to disable a test per GPU. We may not know which aspect is causing the test to fail, especially if the person disabling the test is not the test author.

sarnex avatar Apr 27 '23 15:04 sarnex

Ideally, we should never have a need to disable a test on a particular kind of device, because all optional functionality should be checked through aspects. However, there could be device-specific bugs.

It is very annoying to run tests locally on HW like PVC, which has a few tests disabled specifically for it. Without autodetection, a local E2E run always contains some failures, and the first revision of a PR I submit may then fail on pre-commit because I missed a failure I introduced among the noise.

I think that we should use queries added by sycl_ext_oneapi_device_architecture to detect the exact device and automatically add it as a LIT feature, so all our UNSUPPORTED/XFAIL mechanisms work correctly in all environments.
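A rough sketch of what that mapping could look like on the lit side. It assumes the architecture name (for example `intel_gpu_pvc`, as named by sycl_ext_oneapi_device_architecture) has already been obtained from the device by some means not shown here; the table `ARCH_TO_FEATURE` and the function name are hypothetical and would need to be extended for other devices.

```python
# Illustrative table: architecture name -> lit feature name used in test directives.
# Only one entry is shown; the real set of devices is an assumption left open.
ARCH_TO_FEATURE = {
    "intel_gpu_pvc": "gpu-intel-pvc",
}


def features_for_architectures(arch_names):
    """Return the lit features for the detected architectures, ignoring unknown ones."""
    features = set()
    for name in arch_names:
        feature = ARCH_TO_FEATURE.get(name.strip().lower())
        if feature is not None:
            features.add(feature)
    return features


# Hypothetical detection result: one known and one unknown architecture.
print(features_for_architectures(["intel_gpu_pvc", "unknown_arch"]))
```

Unknown architectures add no feature in this sketch, so UNSUPPORTED/XFAIL directives for them simply do not trigger rather than causing a configuration error.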

AlexeySachkov avatar Mar 18 '24 16:03 AlexeySachkov

This should be addressed by #13976

AlexeySachkov avatar Jun 18 '24 15:06 AlexeySachkov

Hi! There have been no updates for at least the last 60 days, though the issue has assignee(s).

@jzc, could you please take one of the following actions:

  • provide an update if you have any
  • unassign yourself if you're not looking / going to look into this issue
  • mark this issue with the 'confirmed' label if you have confirmed the problem/request and our team should work on it
  • close the issue if it has been resolved
  • take any other suitable action.

Thanks!

github-actions[bot] avatar Aug 18 '24 00:08 github-actions[bot]

Fixed by https://github.com/intel/llvm/pull/13976

jzc avatar Oct 17 '24 16:10 jzc