Auto rerun FAILURE state test jobs in test pipeline

Open llxia opened this issue 1 year ago • 2 comments

We support automatically rerunning failed tests in the test pipeline (see #3431). This is useful for triaging failures without manual intervention. However, the feature requires the test jobs to run to completion, because that is how the failed test names are obtained. If a test job ends in the FAILURE state (e.g., machine disconnected, machine out of space, etc.), someone still needs to restart the job manually. This is very tedious during release time. We should have a feature to automatically rerun test jobs that end in the FAILURE state.

TRSS should support displaying the rerun test jobs. We should also auto-archive the TAP files from the rerun jobs into the parent job.

To simplify this, we will add this to the parent test job. This means a parent test job may trigger the following child test jobs:

  • parallel child test jobs
  • rerun job for failed tests
  • rerun job(s) for FAILURE child test job(s)
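
The decision the parent job would make over its children can be sketched roughly as follows. This is only an illustration in Python, not the pipeline's actual code: `ChildResult`, `plan_reruns`, and the `max_job_reruns` limit are hypothetical names, and it assumes a child that ran to completion with failing tests reports `UNSTABLE` while an infrastructure problem reports `FAILURE`.

```python
from dataclasses import dataclass, field


@dataclass
class ChildResult:
    """Hypothetical summary of one child test job."""
    name: str
    status: str  # assumed Jenkins-style result: SUCCESS, UNSTABLE, FAILURE
    failed_tests: list = field(default_factory=list)


def plan_reruns(children, max_job_reruns=1, attempts=None):
    """Decide what the parent job should trigger after its children finish.

    attempts maps a child job name to how many times it has already been
    rerun; max_job_reruns is an assumed cap to avoid endless rerun loops.
    """
    attempts = attempts or {}
    rerun_jobs = []   # whole child jobs to rerun (ended in FAILURE)
    rerun_tests = []  # individual failed tests to rerun in one job
    for child in children:
        if child.status == "FAILURE":
            # The job did not complete, so no failed test names are known;
            # the only option is to rerun the entire child job.
            if attempts.get(child.name, 0) < max_job_reruns:
                rerun_jobs.append(child.name)
        elif child.status == "UNSTABLE":
            # The job completed, so its failed test names are available.
            rerun_tests.extend(child.failed_tests)
    return {"rerun_jobs": rerun_jobs, "rerun_tests": rerun_tests}
```

With three children where one passes, one completes with a failed test, and one ends in FAILURE, the plan would contain one whole-job rerun and one failed-test rerun. Capping the number of whole-job reruns is a design assumption here; a machine that stays out of space would otherwise be retried indefinitely.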

FYI @pshipton @JasonFengJ9

llxia avatar May 27 '24 15:05 llxia

Please also see some of the recent changes done by @sophia-guo for archiving TAP files.

smlambert avatar May 27 '24 15:05 smlambert

Yes, all TAP files (child jobs, rerun of failed tests, rerun of FAILURE test jobs) should be archived.

llxia avatar May 27 '24 15:05 llxia