
8369515: Deadlock between JVMTI and JNI ReleasePrimitiveArrayCritical

Open dholmes-ora opened this issue 1 month ago • 10 comments

As discussed in JBS, the deadlock occurs when the call to ReleasePrimitiveArrayCritical performs the transition from native to VM and, in the process, checks for special runtime exit conditions, which include the obj_deopt_suspend request. The simple solution is to define a custom JNI ENTRY with a custom ThreadInVMfromNative that elides the exit check.

The change is limited to ReleasePrimitiveArrayCritical and ReleaseStringCritical.
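
The effect of eliding the exit check can be illustrated with a small stand-alone model. This is only a sketch: `ToyThread`, `ThreadState` and `transition_native_to_vm` are hypothetical names invented for illustration and do not reflect how HotSpot's ThreadInVMfromNative is actually implemented.

```cpp
#include <cassert>

// Toy model only: these names echo the discussion, not HotSpot's real classes.
enum class ThreadState { InNative, InVM };

struct ToyThread {
    ThreadState state = ThreadState::InNative;
    bool obj_deopt_suspend_requested = false;  // stands in for the obj_deopt_suspend request
    bool blocked = false;                      // set where the thread would wait on the exit check
};

// Models a native -> VM transition.
// With check_exit_conditions == true, a pending suspend request stops the
// thread before it reaches the VM state. With false, the check is elided
// and the transition always completes.
void transition_native_to_vm(ToyThread& t, bool check_exit_conditions) {
    assert(t.state == ThreadState::InNative && "transition must start in native");
    if (check_exit_conditions && t.obj_deopt_suspend_requested) {
        t.blocked = true;  // in the reported deadlock the thread would wait here
        return;
    }
    t.state = ThreadState::InVM;
}
```

In this model, a thread with a pending request that transitions with the check enabled ends up blocked while still in native state, whereas the same thread with the check elided reaches the VM state and can go on to release its critical region.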

There is no regression test as this has only been seen in long running stress tests.

Testing: tiers 1-6


Progress

  • [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8369515: Deadlock between JVMTI and JNI ReleasePrimitiveArrayCritical (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28779/head:pull/28779
$ git checkout pull/28779

Update a local copy of the PR:
$ git checkout pull/28779
$ git pull https://git.openjdk.org/jdk.git pull/28779/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 28779

View PR using the GUI difftool:
$ git pr show -t 28779

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28779.diff

Using Webrev

Link to Webrev Comment

dholmes-ora avatar Dec 12 '25 04:12 dholmes-ora

:wave: Welcome back dholmes! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Dec 12 '25 04:12 bridgekeeper[bot]

@dholmes-ora This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8369515: Deadlock between JVMTI and JNI ReleasePrimitiveArrayCritical

Co-authored-by: Richard Reingruber <[email protected]>
Reviewed-by: rrich, fbredberg, pchilanomate

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 52 new commits pushed to the master branch:

  • b5ac8f83682ddb9623a1b43bd62f309b2961a504: 8373246: JDK-8351842 broke native debugging on Linux
  • 45642acf1b290306509375e58bde8f6c9cd1b308: 8373710: Improve jpackage error reporting
  • 14c93b2afbf0135e872866c7f8468d9ce6df1e0d: 8373537: Migrate "test/jdk/com/sun/net/httpserver/" to null-safe "SimpleSSLContext" methods
  • ... and 49 more: https://git.openjdk.org/jdk/compare/3f07710270dbe7268f21828dff20e2eb810b1e70...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Dec 12 '25 04:12 openjdk[bot]

@dholmes-ora The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Dec 12 '25 04:12 openjdk[bot]

/label add hotspot-runtime
/label add serviceability

dholmes-ora avatar Dec 15 '25 04:12 dholmes-ora

@dholmes-ora The hotspot-runtime label was successfully added.

openjdk[bot] avatar Dec 15 '25 04:12 openjdk[bot]

@dholmes-ora The serviceability label was successfully added.

openjdk[bot] avatar Dec 15 '25 04:12 openjdk[bot]

Webrevs

mlbridge[bot] avatar Dec 15 '25 04:12 mlbridge[bot]

Thanks for providing a fix for the issue David. I've scheduled some local testing.

reinrich avatar Dec 15 '25 10:12 reinrich

@dholmes-ora I do think that also entering a critical region is problematic if it is nested. I'm currently testing with https://github.com/reinrich/jdk/commit/d7ce2ccb2150c92929b8d9140b4709833d188474 where a thread doesn't suspend for EscapeBarriers while in a critical region.

reinrich avatar Dec 15 '25 15:12 reinrich

> I do think that also entering a critical region is problematic if it is nested.

Isn't nesting critical regions against the rules of using critical regions?

dholmes-ora avatar Dec 16 '25 01:12 dholmes-ora

> > I do think that also entering a critical region is problematic if it is nested.
>
> Isn't nesting critical regions against the rules of using critical regions?

For arrays it's explicitly allowed in the specification. There's also an example how to do it properly :) I'll see if I can implement a reproducer.

reinrich avatar Dec 16 '25 08:12 reinrich

> > > I do think that also entering a critical region is problematic if it is nested.
> >
> > Isn't nesting critical regions against the rules of using critical regions?
>
> For arrays it's explicitly allowed in the specification. There's also an example how to do it properly :) I'll see if I can implement a reproducer.

Right I see that. It states:

> Inside a critical region, native code must not call other JNI functions, ...

but then explicitly allows the critical functions themselves to be an exception. Okay.

So isn't a fix for this simply to skip blocking as per your PR:

 if (is_obj_deopt_suspend() && !in_critical()) {

irrespective of nesting and without any need for any of the changes I proposed?
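
To see why this condition is insensitive to nesting, here is a minimal stand-alone sketch. It assumes the critical state is tracked as a depth counter; `ToyJavaThread` and all its members are hypothetical names for illustration and are not the real JavaThread API.

```cpp
#include <cassert>

// Toy model: names echo the discussion but this is NOT HotSpot's JavaThread.
class ToyJavaThread {
public:
    // Each Get*Critical call is modelled as one level of nesting.
    void enter_critical() { ++critical_depth_; }

    // Each Release*Critical call undoes one level.
    void exit_critical() {
        assert(critical_depth_ > 0 && "unbalanced exit_critical");
        --critical_depth_;
    }

    bool in_critical() const { return critical_depth_ > 0; }

    void request_obj_deopt_suspend() { obj_deopt_suspend_ = true; }
    bool is_obj_deopt_suspend() const { return obj_deopt_suspend_; }

    // Same shape as the quoted condition: block only when a suspend is
    // requested AND the thread holds no critical region at any depth.
    bool should_block_for_obj_deopt() const {
        return is_obj_deopt_suspend() && !in_critical();
    }

private:
    int critical_depth_ = 0;          // assumed nesting counter
    bool obj_deopt_suspend_ = false;  // assumed pending-request flag
};
```

With a request pending, the model reports that the thread should not block after one or two `enter_critical()` calls, and only reports that it should block again once every level has been released.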

dholmes-ora avatar Dec 16 '25 11:12 dholmes-ora

> So isn't a fix for this simply to skip blocking as per your PR:
>
>  if (is_obj_deopt_suspend() && !in_critical()) {
>
> irrespective of nesting and without any need for any of the changes I proposed?

Yes, I currently think so. Testing so far is good. Don't have a reproducer yet for the deadlock.

reinrich avatar Dec 16 '25 12:12 reinrich

I tested tiers 1-6 with the simplified fix and nothing has turned up.

dholmes-ora avatar Dec 17 '25 21:12 dholmes-ora

Thanks! Same with my testing. I doubt that I'll find the time to write a reproducer though.

reinrich avatar Dec 18 '25 00:12 reinrich

/contributor add @reinrich

dholmes-ora avatar Dec 18 '25 01:12 dholmes-ora

@dholmes-ora Contributor Richard Reingruber <[email protected]> successfully added.

openjdk[bot] avatar Dec 18 '25 01:12 openjdk[bot]

FTR we are not processing this special exit condition when transitioning from _thread_blocked to _thread_in_vm already, so I would assume it should be also okay for any native->vm transition.

pchilano avatar Dec 18 '25 18:12 pchilano

Thanks for the reviews @fbredber and @pchilano !

dholmes-ora avatar Dec 19 '25 01:12 dholmes-ora

I wonder why there are no GHA tests?

reinrich avatar Dec 19 '25 09:12 reinrich

> I wonder why there are no GHA tests?

I don't run GHA by default.

dholmes-ora avatar Dec 19 '25 12:12 dholmes-ora

A deadlock can still occur if the debugger suspends the thread in the critical region, e.g. to read a local variable. After JDK-8373839 this shouldn't be possible anymore.

reinrich avatar Dec 19 '25 13:12 reinrich