8369515: Deadlock between JVMTI and JNI ReleasePrimitiveArrayCritical
As discussed in JBS, the deadlock occurs when the call to ReleasePrimitiveArrayCritical transitions from native to VM and, in the process, checks for special runtime exit conditions, which include the obj_deopt_suspend request. The simple solution is to define a custom JNI ENTRY with a custom ThreadInVMfromNative that elides the exit check.
The change is limited to ReleasePrimitiveArrayCritical and ReleaseStringCritical.
There is no regression test as this has only been seen in long running stress tests.
Testing: tiers 1-6
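To illustrate the idea described above, here is a minimal toy model, not HotSpot code. `ToyThread`, `ToyTransitionGuard`, and `toy_release_critical` are hypothetical names invented for this sketch. It only assumes that the regular native->VM transition may stall on a pending obj_deopt_suspend request, and that the proposed variant skips that check.

```cpp
#include <cassert>

// Toy model only: none of these types are HotSpot's real classes.
enum class ToyState { in_native, in_vm };

struct ToyThread {
    ToyState state = ToyState::in_native;
    bool obj_deopt_suspend_requested = false;  // stands in for the obj_deopt_suspend request
    bool blocked_in_transition = false;        // records that the thread would have waited
};

// RAII guard modelling a native->VM transition. The flag selects between
// the regular behaviour and the variant that elides the exit check.
class ToyTransitionGuard {
public:
    ToyTransitionGuard(ToyThread& t, bool check_exit_conditions) : thread_(t) {
        if (check_exit_conditions && thread_.obj_deopt_suspend_requested) {
            // A real VM thread would wait here; the model only records it.
            thread_.blocked_in_transition = true;
        }
        thread_.state = ToyState::in_vm;
    }
    ~ToyTransitionGuard() { thread_.state = ToyState::in_native; }

private:
    ToyThread& thread_;
};

// Hypothetical stand-in for a release-critical entry point.
// Returns true if the release would have stalled in the transition.
bool toy_release_critical(ToyThread& t, bool check_exit_conditions) {
    ToyTransitionGuard guard(t, check_exit_conditions);
    return t.blocked_in_transition;
}
```

With a suspend request pending, `toy_release_critical(t, true)` reports a stall before the critical region is released, while `toy_release_critical(t, false)` completes the release and returns the thread to the native state.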
Progress
- [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8369515: Deadlock between JVMTI and JNI ReleasePrimitiveArrayCritical (Bug - P3)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28779/head:pull/28779
$ git checkout pull/28779
Update a local copy of the PR:
$ git checkout pull/28779
$ git pull https://git.openjdk.org/jdk.git pull/28779/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 28779
View PR using the GUI difftool:
$ git pr show -t 28779
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28779.diff
:wave: Welcome back dholmes! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@dholmes-ora This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
8369515: Deadlock between JVMTI and JNI ReleasePrimitiveArrayCritical
Co-authored-by: Richard Reingruber <[email protected]>
Reviewed-by: rrich, fbredberg, pchilanomate
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 52 new commits pushed to the master branch:
- b5ac8f83682ddb9623a1b43bd62f309b2961a504: 8373246: JDK-8351842 broke native debugging on Linux
- 45642acf1b290306509375e58bde8f6c9cd1b308: 8373710: Improve jpackage error reporting
- 14c93b2afbf0135e872866c7f8468d9ce6df1e0d: 8373537: Migrate "test/jdk/com/sun/net/httpserver/" to null-safe "SimpleSSLContext" methods
- ... and 49 more: https://git.openjdk.org/jdk/compare/3f07710270dbe7268f21828dff20e2eb810b1e70...master
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.
@dholmes-ora The following label will be automatically applied to this pull request:
- hotspot
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
/label add hotspot-runtime
/label add serviceability
@dholmes-ora
The hotspot-runtime label was successfully added.
@dholmes-ora
The serviceability label was successfully added.
Thanks for providing a fix for the issue David. I've scheduled some local testing.
@dholmes-ora I do think that also entering a critical region is problematic if it is nested. I'm currently testing with https://github.com/reinrich/jdk/commit/d7ce2ccb2150c92929b8d9140b4709833d188474 where a thread doesn't suspend for EscapeBarriers while in a critical region.
I do think that also entering a critical region is problematic if it is nested.
Isn't nesting critical regions against the rules of using critical regions?
For arrays it's explicitly allowed in the specification. There's also an example of how to do it properly :) I'll see if I can implement a reproducer.
Right I see that. It states:
Inside a critical region, native code must not call other JNI functions, ...
but then explicitly allows the critical functions themselves to be an exception. Okay.
So isn't a fix for this simply to skip blocking as per your PR:
if (is_obj_deopt_suspend() && !in_critical()) {
irrespective of nesting and without any need for any of the changes I proposed?
Yes, I currently think so. Testing so far is good. Don't have a reproducer yet for the deadlock.
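For readers following along, the simplified condition can be sketched with a small toy model. This is not the actual JavaThread: `ToyJavaThread`, `critical_depth`, and `should_block_for_obj_deopt` are hypothetical names, and the nesting counter is an assumption about how in_critical() might be tracked. Only the `is_obj_deopt_suspend() && !in_critical()` condition comes from the discussion.

```cpp
#include <cassert>

// Toy model only: names echo the discussion, not HotSpot's real JavaThread.
struct ToyJavaThread {
    int  critical_depth = 0;          // hypothetical nesting counter for critical regions
    bool obj_deopt_suspend = false;   // stands in for a pending obj_deopt_suspend request

    void enter_critical() { ++critical_depth; }
    void exit_critical() {
        assert(critical_depth > 0);   // releases must match earlier enters
        --critical_depth;
    }

    bool in_critical() const { return critical_depth > 0; }
    bool is_obj_deopt_suspend() const { return obj_deopt_suspend; }

    // The condition quoted in the thread: block for the suspend request
    // only when the thread is outside every critical region.
    bool should_block_for_obj_deopt() const {
        return is_obj_deopt_suspend() && !in_critical();
    }
};
```

In this model a thread holding two nested critical regions does not block after the first `exit_critical()`; the request is honoured only once the outermost region has been released, which is why nesting needs no special handling.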
I tested tiers 1-6 with the simplified fix and nothing has turned up.
Thanks! Same with my testing. I doubt that I'll find the time to write a reproducer though.
/contributor add @reinrich
@dholmes-ora
Contributor Richard Reingruber <[email protected]> successfully added.
FTR, we already skip this special exit condition when transitioning from _thread_blocked to _thread_in_vm, so I would assume it is also okay for any native->vm transition.
Thanks for the reviews @fbredber and @pchilano !
I wonder why there are no GHA tests?
I don't run GHA by default.
A deadlock can still occur if the debugger suspends the thread in the critical region, e.g. to read a local variable. After JDK-8373839 this shouldn't be possible anymore.