
8373630: r18_tls should not be modified on Windows AArch64

Open swesonga opened this issue 1 month ago • 9 comments

On Windows, r18_tls stores the pointer to the current thread's TEB, so this register should never be modified (see details in register_aarch64.hpp). One scenario that modifies r18_tls involves virtual threads on Windows. Continuation::try_preempt freezes frames on one carrier thread and saves that thread's registers. When the frames are thawed, execution can continue on a different carrier thread. In that case, rthread (x28) is fixed to point to the new carrier thread. The continuation then has restore_live_registers restore all the saved registers (including the fixed rthread register). However, this also restores x18, which held the TEB pointer of the previous carrier thread, so the new carrier thread executes with the TLS of the previous carrier thread. This causes hangs and occasional crashes in the virtual threads jtreg tests on Windows AArch64, which this fix resolves.
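
The failure mode described above can be modeled outside the JVM. The following is a minimal sketch, not HotSpot code: the register file is a plain array, the pointer values are made up, and `thaw_on_new_carrier` and the `preserve_tls` flag are hypothetical names introduced only for illustration. The two function names borrowed from the description (`save_live_registers`, `restore_live_registers`) are stand-ins and do not reproduce the real assembler routines.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of the AArch64 general-purpose registers x0..x30.
constexpr int kNumRegs = 31;
constexpr int kTlsReg = 18;     // x18: TEB pointer on Windows AArch64
constexpr int kThreadReg = 28;  // x28: rthread, the current thread pointer

using RegFile = std::array<std::uint64_t, kNumRegs>;

// Freeze: snapshot every register of the carrier that ran the continuation.
RegFile save_live_registers(const RegFile& cpu) { return cpu; }

// Thaw: copy the snapshot onto the new carrier's registers.
// With preserve_tls set, x18 is skipped so the carrier keeps its own TEB pointer.
void restore_live_registers(RegFile& cpu, const RegFile& saved, bool preserve_tls) {
  for (int i = 0; i < kNumRegs; ++i) {
    if (preserve_tls && i == kTlsReg) continue;
    cpu[i] = saved[i];
  }
}

// Freezes on carrier A, thaws on carrier B, and returns B's x18 after the thaw.
std::uint64_t thaw_on_new_carrier(bool preserve_tls) {
  RegFile carrier_a{};
  carrier_a[kTlsReg] = 0xA000;     // made-up TEB address of carrier A
  carrier_a[kThreadReg] = 0xA100;  // made-up thread pointer of carrier A
  RegFile saved = save_live_registers(carrier_a);

  RegFile carrier_b{};
  carrier_b[kTlsReg] = 0xB000;     // made-up TEB address of carrier B
  saved[kThreadReg] = 0xB100;      // rthread is fixed up before the restore
  restore_live_registers(carrier_b, saved, preserve_tls);
  return carrier_b[kTlsReg];
}
```

In this model, `thaw_on_new_carrier(false)` leaves carrier B holding carrier A's TEB value, which is the stale-TLS situation the description reports, while `thaw_on_new_carrier(true)` leaves B's own value in place.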


Progress

  • [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8373630: r18_tls should not be modified on Windows AArch64 (Bug - P1)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28808/head:pull/28808
$ git checkout pull/28808

Update a local copy of the PR:
$ git checkout pull/28808
$ git pull https://git.openjdk.org/jdk.git pull/28808/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 28808

View PR using the GUI difftool:
$ git pr show -t 28808

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28808.diff

Using Webrev

Link to Webrev Comment

swesonga avatar Dec 12 '25 22:12 swesonga

:wave: Welcome back swesonga! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Dec 12 '25 22:12 bridgekeeper[bot]

@swesonga This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8373630: r18_tls should not be modified on Windows AArch64

Reviewed-by: pchilanomate, aph

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 39 new commits pushed to the master branch:

  • b0b42e7eb14dbe04c9c00e8d1fda139a502f2120: 8373615: Improve HotSpot debug functions findclass() and findmethod
  • 81e375768837e1ae6c34c1d0a8eff06b4e1d2889: 8373566: Performance regression with java.text.MessageFormat subformat patterns
  • 76e79dbb3eca5589aae6852c8f55adf0759c714e: 8371716: C2: Phi node fails Value()'s verification when speculative types clash
  • ... and 36 more: https://git.openjdk.org/jdk/compare/23c39757ecdc834c631f98f4487cfea21c9b948b...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@pchilano, @theRealAph) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

openjdk[bot] avatar Dec 12 '25 22:12 openjdk[bot]

@swesonga The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Dec 12 '25 22:12 openjdk[bot]

Webrevs

mlbridge[bot] avatar Dec 12 '25 23:12 mlbridge[bot]

Shouldn't the #ifdef be using R18_RESERVED?

dean-long avatar Dec 13 '25 00:12 dean-long

Nice find. It would be really useful to have a test case that reproduces the problem, and also some idea of how likely it is. I bumped the bug to P1 for now.

dean-long avatar Dec 13 '25 00:12 dean-long

> Shouldn't the #ifdef be using R18_RESERVED?

Yes, I have changed the condition to R18_RESERVED.

swesonga avatar Dec 13 '25 05:12 swesonga

> Nice find. It would be really useful to have a test case that reproduces the problem, and also some idea of how likely it is. I bumped the bug to P1 for now.

The virtual threads MonitorEnterExit test fails 100% of the time on Windows AArch64 without this change (but it does not fail on macosx-aarch64 without this change, even though x18 is also reserved on macosx-aarch64). I was specifically running the testMutualExclusion parameterized test with 0 platform threads and at least 2 virtual threads when investigating this behavior.

swesonga avatar Dec 13 '25 05:12 swesonga

I think this is OK for a quick fix for the upcoming release, but in future save/restore should be fixed so that they exclude r18_tls.

theRealAph avatar Dec 13 '25 09:12 theRealAph
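
The longer-term suggestion above, having save/restore exclude r18_tls, could look roughly like the following sketch. It is an illustration under stated assumptions, not the HotSpot implementation: the bitmask representation and the names `reg_bit`, `kAllGprs`, `kSaveRestoreSet`, `is_saved`, and `count_saved` are hypothetical, and `R18_RESERVED` is defined locally here only so the fragment compiles on its own.

```cpp
#include <cstdint>

// Assumption: defined locally for this sketch; in a real build the macro
// would come from the platform headers on targets that reserve x18.
#ifndef R18_RESERVED
#define R18_RESERVED
#endif

// One bit per general-purpose register x0..x30 (hypothetical encoding).
constexpr std::uint32_t reg_bit(int n) { return std::uint32_t{1} << n; }
constexpr std::uint32_t kAllGprs = 0x7FFFFFFFu;

// Registers that a save/restore pair is allowed to touch.
#ifdef R18_RESERVED
constexpr std::uint32_t kSaveRestoreSet = kAllGprs & ~reg_bit(18);
#else
constexpr std::uint32_t kSaveRestoreSet = kAllGprs;
#endif

// True if register xN would be written by the save/restore pair.
constexpr bool is_saved(int n) { return (kSaveRestoreSet & reg_bit(n)) != 0; }

// Number of registers in the save/restore set.
int count_saved() {
  int count = 0;
  for (int n = 0; n < 31; ++n) {
    if (is_saved(n)) ++count;
  }
  return count;
}
```

Deriving the set once from the reservation macro means neither the save nor the restore path can touch x18 on a platform that reserves it, rather than relying on each caller to patch the register afterwards.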

/integrate

swesonga avatar Dec 16 '25 15:12 swesonga

@swesonga Your change (at version e5a9ef0ef28947361cd9d680a55eb8d4b1fec73c) is now ready to be sponsored by a Committer.

openjdk[bot] avatar Dec 16 '25 15:12 openjdk[bot]

/sponsor

theRealAph avatar Dec 16 '25 18:12 theRealAph

Going to push as commit a0dd66f92d7f8400b9800847e36d036315628afb. Since your change was applied there have been 39 commits pushed to the master branch:

  • b0b42e7eb14dbe04c9c00e8d1fda139a502f2120: 8373615: Improve HotSpot debug functions findclass() and findmethod
  • 81e375768837e1ae6c34c1d0a8eff06b4e1d2889: 8373566: Performance regression with java.text.MessageFormat subformat patterns
  • 76e79dbb3eca5589aae6852c8f55adf0759c714e: 8371716: C2: Phi node fails Value()'s verification when speculative types clash
  • ... and 36 more: https://git.openjdk.org/jdk/compare/23c39757ecdc834c631f98f4487cfea21c9b948b...master

Your commit was automatically rebased without conflicts.

openjdk[bot] avatar Dec 16 '25 18:12 openjdk[bot]

@theRealAph @swesonga Pushed as commit a0dd66f92d7f8400b9800847e36d036315628afb.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot] avatar Dec 16 '25 18:12 openjdk[bot]

What a bug! @swesonga, are you handling backports?

shipilev avatar Dec 17 '25 19:12 shipilev

> What a bug! @swesonga, are you handling backports?

Yes, I'm preparing the jdk26u backport this afternoon.

swesonga avatar Dec 17 '25 20:12 swesonga

> Yes, I'm preparing the jdk26u backport this afternoon

You want to backport to the jdk26 branch, not the 26u repo.

dholmes-ora avatar Dec 19 '25 00:12 dholmes-ora