7901757: Race in counting total number of failures from TestNG

Open jaikiran opened this issue 1 year ago • 3 comments

Can I please get a review of this change which proposes to fix the issue noted in https://bugs.openjdk.org/browse/CODETOOLS-7901757?

As noted there, the current implementation of jtreg's TestNG test listener doesn't take into account that the callback methods on a listener instance can be invoked concurrently by different threads. As a result, the count-tracking logic in the listener is subject to a race condition, which can cause it to report incorrect numbers.

The commit in this PR fixes that issue and introduces a self test which reproduces the race issue and verifies the fix. The new test and existing tests continue to pass with this change.
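To illustrate the kind of race described above, here is a minimal, self-contained sketch. It is not jtreg's actual listener or the code in this PR: the class `FailureCountSketch`, the nested `CountingListener`, and the `onTestFailure()` callback are hypothetical names chosen for illustration. It contrasts a plain `int` counter, whose `++` is a non-atomic read-modify-write, with an `AtomicInteger`, which is one common remedy (the fix in this PR may use a different mechanism).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FailureCountSketch {

    // Hypothetical listener; stands in for any callback object shared across threads.
    static class CountingListener {
        int unsafeFailures;                                      // plain field: ++ can lose updates
        final AtomicInteger safeFailures = new AtomicInteger();  // atomic increment

        void onTestFailure() {
            unsafeFailures++;               // read, add, write: not atomic
            safeFailures.incrementAndGet(); // single atomic operation
        }
    }

    // Invokes the callback from several threads at once and returns the listener.
    static CountingListener run(int threads, int callbacksPerThread) throws InterruptedException {
        CountingListener listener = new CountingListener();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                try {
                    start.await(); // release all workers together to increase contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < callbacksPerThread; i++) {
                    listener.onTestFailure();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return listener;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 8;
        int perThread = 100_000;
        CountingListener listener = run(threads, perThread);
        System.out.println("expected: " + (threads * perThread));
        System.out.println("atomic:   " + listener.safeFailures.get());
        System.out.println("plain:    " + listener.unsafeFailures);
    }
}
```

The atomic counter always matches the expected total. The plain counter may come up short on some runs and match on others, which is why a race like this can go unnoticed for a long time.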


Progress

  • [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • CODETOOLS-7901757: Race in counting total number of failures from TestNG (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jtreg.git pull/216/head:pull/216
$ git checkout pull/216

Update a local copy of the PR:
$ git checkout pull/216
$ git pull https://git.openjdk.org/jtreg.git pull/216/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 216

View PR using the GUI difftool:
$ git pr show -t 216

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jtreg/pull/216.diff

Webrev

Link to Webrev Comment

Jul 26 '24 09:07 jaikiran

:wave: Welcome back jpai! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Jul 26 '24 09:07 bridgekeeper[bot]

@jaikiran This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

7901757: Race in counting total number of failures from TestNG

Reviewed-by: cstein, jjg

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 2 new commits pushed to the master branch:

  • feee5044a785d6dcef2b3b82f7d8d11daf09e594: 7903784: NullPointerException: Cannot read the array length because the return value of "java.io.File.listFiles()" is null
  • a9172d632af9059c0df2cc3c8316866654a98dd0: 7903759: Amend changelog to include the removal of jtdiff

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch. As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

Jul 26 '24 09:07 openjdk[bot]

Webrevs

Jul 26 '24 09:07 mlbridge[bot]

Hello Christian,

> With soon correct numbers being reported, will we see new errors in the OpenJDK (and other projects') test suites?

I don't expect the change in this PR to expose failures that have gone undetected so far, because the issue here is a race condition that could occasionally cause the failure count to be reported incorrectly. For a failure to have gone unnoticed until now, the race would have had to occur on every single run of that test, which would be very odd.

In any case, in the next few days, I will include this change as part of the JDK tier testing I run as part of new jtreg changes and see how it goes.

Aug 15 '24 12:08 jaikiran

We should make sure that the general jtreg documentation accurately describes when tests may or may not run concurrently.

Aug 15 '24 19:08 jonathan-gibbons

> In any case, in the next few days, I will include this change as part of the JDK tier testing I run as part of new jtreg changes and see how it goes.

I tried this against the JDK mainline. Although the run had failures, they are unrelated to this change. I will go ahead and integrate this change later today.

Aug 21 '24 01:08 jaikiran

/integrate

Aug 21 '24 05:08 jaikiran

Going to push as commit 3750f09675011ff237a65b1c623507e83d17796f. Since your change was applied there have been 5 commits pushed to the master branch:

  • 5d74713fc66aac9829b188bd3fc19ce3e2c5a812: 7903793: Latent typo in ReportOnlyTest.gmk
  • 789f00d3b54d32ef1dd1270439981676eab6d19b: 7903792: ReportOnlyTest may fail in "headless" environments
  • ca8c1ecf924c4ea40fc13a907559aaa90f3d0196: 7903765: wget failed in build.sh in jtreg
  • feee5044a785d6dcef2b3b82f7d8d11daf09e594: 7903784: NullPointerException: Cannot read the array length because the return value of "java.io.File.listFiles()" is null
  • a9172d632af9059c0df2cc3c8316866654a98dd0: 7903759: Amend changelog to include the removal of jtdiff

Your commit was automatically rebased without conflicts.

Aug 21 '24 05:08 openjdk[bot]

@jaikiran Pushed as commit 3750f09675011ff237a65b1c623507e83d17796f.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Aug 21 '24 05:08 openjdk[bot]