clusterfuzz icon indicating copy to clipboard operation
clusterfuzz copied to clipboard

Iterate through all revisions to find first good revision before bisecting.

Open letitz opened this issue 1 year ago • 2 comments

This change builds on #3927 to change the way regression task operates.

To review, given GitHub's lack of support for stacked PRs, this link should work: https://github.com/google/clusterfuzz/pull/3934/files/1bdca4468269ee897b5145f6dc24c395b6eb4d2d..dc74698caba2f0ce6512eca69d93d41d1692aef7

The overall goal is to be more dogged when trying to find the first good revision to start bisecting from, and try all revisions. This may take too long, so we respect the task deadline and ensure that this step is resumable in a subsequent regression task. Regression task already supports resuming timed out regressions later, as this is how the core bisection works.

Previously, regression task would start its first run (detected by the absence of last_regression_min or last_regression_max metadata on the testcase) by:

  • checking whether the regression happened in the last N revisions, ending early if so, having found the regression range
  • checking whether one of the first N revisions was good, and:
    • if not, ending early with an error (this is what is changing)
    • if so, and the first good revision crashed, ending early, having found the regression range to be [0, min)
    • if so, and the first good revision did not crash, going on to bisect

This change does a few things:

  • splits the notion of first run into two: check separately for last_regression_max and last_regression_min
  • when last_regression_max is absent from the testcase:
    • check the latest N revisions for a regression
    • update last_regression_max on the testcase to indicate this happened
  • when last_regression_min is absent from the testcase:
    • check all revisions sequentially to find the first good revision
    • update last_regression_min as we go along to record our progress
    • time out if this takes too long
  • side benefit: since we now record the progress made in these two steps, we can avoid:
    • testing the same revision in check_latest_revisions and find_earliest_good_revision
    • bisecting revisions that were already tested in check_latest_revisions or find_earliest_good_revision

TODO: An integration test of some kind that checks that forward progress is made after timeouts.

letitz avatar Apr 09 '24 14:04 letitz

@alhijazi PTAL while I work out how to write some kind of integration test.

letitz avatar Apr 09 '24 15:04 letitz

Thanks! Onto @jonathanmetzman, who may have suggestions on how/where to write an overarching test.

letitz avatar Apr 12 '24 12:04 letitz