openj9
Add SIGUSR2 handler and matching -Xdump event
@pshipton @keithc-ca FYI
Notes:
- SIGUSR1 seems to be used internally but I couldn't find any use of SIGUSR2
- Mimicked sigquit.c
- Tentatively chose "sigusr" as the -Xdump event (tried user2 but numerals not accepted)
- No-op on Windows
- Unclear if this event needs to be part of the SigQuit thread processing
- Removed the call to TRIGGER_J9HOOK_VM_USER_INTERRUPT in sigUsr2Handler
- Not sharing exclusive access to resolve issue #9256
Before the patch, the process exits:
$ printf "class Hang { public static void main(String... args) throws Throwable { Object o = new Object(); synchronized (o) { o.wait(); } } }" > Hang.java
$ javac Hang.java
$ java Hang &
$ kill -USR2 %1
[1]+ User defined signal 2: 31 java Hang
After the patch, a javacore is produced and the process continues running:
$ java Hang &
$ kill -USR2 %1
JVMDUMP039I Processing dump event "sigusr", detail "" at 2022/07/27 15:01:51 - please wait.
JVMDUMP032I JVM requested Java dump using '/Users/kevin/git/openj9-openjdk-jdk8/build/macosx-x86_64-normal-server-release/images/j2sdk-image/javacore.20220727.150151.71696.0001.txt' in response to an event
JVMDUMP010I Java dump written to /Users/kevin/git/openj9-openjdk-jdk8/build/macosx-x86_64-normal-server-release/images/j2sdk-image/javacore.20220727.150151.71696.0001.txt
JVMDUMP013I Processed dump event "sigusr", detail "".
Customizing the sigusr event also works:
$ java -Xdump:system:events=sigusr,request=exclusive+prepwalk Hang &
$ kill -USR2 %1
JVMDUMP039I Processing dump event "sigusr", detail "" at 2022/07/27 15:03:21 - please wait.
JVMDUMP032I JVM requested System dump using '/Users/kevin/git/openj9-openjdk-jdk8/build/macosx-x86_64-normal-server-release/images/j2sdk-image/core.20220727.150321.71712.0001.dmp' in response to an event
JVMDUMP010I System dump written to /Users/kevin/git/openj9-openjdk-jdk8/build/macosx-x86_64-normal-server-release/images/j2sdk-image/core.20220727.150321.71712.0001.dmp
JVMDUMP032I JVM requested Java dump using '/Users/kevin/git/openj9-openjdk-jdk8/build/macosx-x86_64-normal-server-release/images/j2sdk-image/javacore.20220727.150321.71712.0002.txt' in response to an event
JVMDUMP010I Java dump written to /Users/kevin/git/openj9-openjdk-jdk8/build/macosx-x86_64-normal-server-release/images/j2sdk-image/javacore.20220727.150321.71712.0002.txt
JVMDUMP013I Processed dump event "sigusr", detail "".
We need a documentation issue created for this.
> Unclear if this event needs to be part of the SigQuit thread processing
Looks like yes it does
sigusr may be confusing. Some other ideas: usertwo, altuser, or look into why numbers aren't accepted and fix that if possible.
> We need a documentation issue created for this.
Sure, I can do that.
> > Unclear if this event needs to be part of the SigQuit thread processing
> Looks like yes it does
Ok, I'll add that in.
> sigusr may be confusing. Some other ideas: usertwo, altuser, or look into why numbers aren't accepted and fix that if possible.
Sure, I don't have a strong opinion on the event name. @keithc-ca any opinion?
There are still several places that declare/use things related to SIGUSR2 that are not conditional on the enabling flag.
I think this should be an opt-in feature: a user must explicitly request handling SIGUSR2 via -Xdump:java:... options so there isn't a conflict with existing uses of that signal.
@keithc-ca Makes sense. I'll remove the default change. I didn't fully understand the eventMask fields in rasDumpSpecs in dmpagent.c - are those changing defaults or specifying what events can drive those agents?
I'm out of the office until next week but the rest of the comments make sense and I'll update then.
I'm not sure why you had trouble using user2 as the event name; I don't see anything that should object to digits, it just needs to match the entry in dmpagent.c: rasDumpEvents.
I meant to say in my previous comment that "user2" is my preference for the new event name.
I thought of why user2 wasn't being parsed: It was complaining about unresolved tokens starting at 2, so it was resolving the user signal and then the 2 was left over, so I'll just need to place the user2 definition above the user definition.
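To illustrate the ordering problem described above, here is a minimal sketch in C. It assumes the parser accepts the first table entry whose name is a prefix of the remaining input. The EventSpec type, the two tables, and the function names are hypothetical and do not reflect the real rasDumpEvents layout in dmpagent.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical event table entry; NOT the real structure in dmpagent.c. */
typedef struct {
    const char *name;
    unsigned bits;
} EventSpec;

#define EVENT_USER  0x1u
#define EVENT_USER2 0x2u

/* "user" is listed first, so it shadows "user2" under prefix matching. */
static const EventSpec USER_FIRST[] = {
    { "user",  EVENT_USER  },
    { "user2", EVENT_USER2 },
};

/* The longer name is listed first, so it gets a chance to match. */
static const EventSpec USER2_FIRST[] = {
    { "user2", EVENT_USER2 },
    { "user",  EVENT_USER  },
};

/*
 * Find the first entry whose name is a prefix of *cursor, advance the
 * cursor past that name, and return the entry's bits (0 if none match).
 */
static unsigned match_event(const EventSpec *table, size_t count, const char **cursor)
{
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(table[i].name);
        if (strncmp(*cursor, table[i].name, len) == 0) {
            *cursor += len;
            return table[i].bits;
        }
    }
    return 0;
}

/*
 * Parse a single event name. Returns the event bits only if the whole
 * input was consumed; otherwise returns 0 and *leftover points at the
 * unresolved remainder.
 */
static unsigned parse_event(const EventSpec *table, size_t count,
                            const char *text, const char **leftover)
{
    const char *cursor = text;
    unsigned bits = match_event(table, count, &cursor);
    *leftover = cursor;
    return (*cursor == '\0') ? bits : 0;
}
```

With USER_FIRST, parsing "user2" matches user and leaves "2" unresolved; with USER2_FIRST, the whole token is consumed and plain "user" still resolves to the user event.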
@keithc-ca @pshipton Updated and squashed based on feedback.
$ java -Xdump:java:events=user2,request=exclusive+prepwalk Hang &
[1] 31864
$ kill -USR2 %1
JVMDUMP039I Processing dump event "user2", detail "" at 2022/08/01 10:02:40 - please wait.
JVMDUMP032I JVM requested Java dump using '/Users/kevin/git/openj9-openjdk-jdk8/build/macosx-x86_64-normal-server-release/images/j2sdk-image/javacore.20220801.100240.31864.0001.txt' in response to an event
JVMDUMP010I Java dump written to /Users/kevin/git/openj9-openjdk-jdk8/build/macosx-x86_64-normal-server-release/images/j2sdk-image/javacore.20220801.100240.31864.0001.txt
JVMDUMP013I Processed dump event "user2", detail "".
There is still a minor default change in that, previously, SIGUSR2 would cause the process to exit:
$ java Hang &
[1] 32102
$ kill -USR2 %1
[1]+ User defined signal 2: 31 java Hang
Now, even if no -Xdump event is registered, the process no longer exits:
$ java Hang &
[1] 31858
$ kill -USR2 %1
$
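For context on this behavior change: on POSIX systems the default disposition of SIGUSR2 is to terminate the process, so installing any handler is enough to keep the process alive. The sketch below shows that with sigaction. The handler and function names are hypothetical, and this is not the OpenJ9 implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t usr2_seen = 0;

/* Hypothetical handler: record the signal and return, so the process survives. */
static void on_usr2(int signo)
{
    (void)signo;
    usr2_seen = 1;
}

/*
 * Replace the default SIGUSR2 disposition (terminate) with on_usr2.
 * Returns 0 on success, -1 on failure, as sigaction does.
 */
static int install_usr2_handler(void)
{
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_handler = on_usr2;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_RESTART; /* restart interrupted system calls */
    return sigaction(SIGUSR2, &action, NULL);
}
```

After install_usr2_handler() succeeds, raise(SIGUSR2) or kill -USR2 from a shell sets usr2_seen instead of ending the process. A real handler would then hand the work to another thread rather than doing it in signal context.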
jenkins compile win jdk8
> I thought of why user2 wasn't being parsed: It was complaining about unresolved tokens starting at 2, so it was resolving the user signal and then the 2 was left over, so I'll just need to place the user2 definition above the user definition.
Yuck! I think a comment is warranted in that list (I was going to suggest they be ordered alphabetically).
jenkins compile win jdk8
@keithc-ca Feedback processed, please re-review
jenkins test sanity win,win32 jdk8
jenkins test sanity osx,zlinux jdk17
The still running PR testing is https://openj9-jenkins.osuosl.org/job/PullRequest-OpenJ9/2544/
> The still running PR testing is https://openj9-jenkins.osuosl.org/job/PullRequest-OpenJ9/2544/
I was aware, just getting ready.
However, I would like to see this squashed after that testing is complete.
For the record, test builds are:
- https://openj9-jenkins.osuosl.org/job/PullRequest-OpenJ9/2544/
- https://openj9-jenkins.osuosl.org/job/PullRequest-OpenJ9/2545/
Tests passed; squashed.