8294211: Zero: Decode arch-specific error context if possible

Open shipilev opened this issue 3 years ago • 8 comments

After the POSIX signal refactorings, Zero error handling "regressed" a bit: Zero always gets NULL as the pc in the error handling code, so every crash is reported as a SEGV at pc=0x0. We can do better by implementing context decoding where possible.

Unfortunately, this introduces some arch-specific code into Zero. The arch-specific code is copy-pasted (with inline definitions, if needed) from the relevant os_linux_*.cpp files. The unimplemented arches would still report the same confusing hs_err-s. We can emulate (and thus test) that generic behavior using a new diagnostic VM option, -XX:-DecodeErrorContext.

This reverts parts of JDK-8259392.
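
To illustrate what "context decoding" means here, the sketch below shows the general idea on Linux: a signal handler installed with SA_SIGINFO receives a ucontext_t, and the interrupted pc sits in an arch-specific slot of uc_mcontext. This is an illustration only, not the code from this patch. The names decode_pc, pc_at_signal and decode_enabled (a stand-in for the new diagnostic option) are hypothetical, and only x86_64 and aarch64 are covered.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes REG_RIP in glibc's <sys/ucontext.h>
#endif
#include <signal.h>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the diagnostic switch described above:
// when false, every platform takes the "generic" path and reports no pc.
static bool decode_enabled = true;

// Hypothetical helper: extract the pc from a signal context where the
// layout is known, otherwise return nullptr (the generic behavior).
static void* decode_pc(const void* ucVoid) {
  if (!decode_enabled || ucVoid == nullptr) return nullptr;
  const ucontext_t* uc = static_cast<const ucontext_t*>(ucVoid);
#if defined(__linux__) && defined(__x86_64__)
  return reinterpret_cast<void*>(
      static_cast<uintptr_t>(uc->uc_mcontext.gregs[REG_RIP]));
#elif defined(__linux__) && defined(__aarch64__)
  return reinterpret_cast<void*>(
      static_cast<uintptr_t>(uc->uc_mcontext.pc));
#else
  (void)uc;  // unimplemented arch: nothing to decode
  return nullptr;
#endif
}

// Written by the handler, read back by pc_at_signal().
static void* volatile last_pc = nullptr;

static void handler(int, siginfo_t*, void* ucVoid) {
  last_pc = decode_pc(ucVoid);
}

// Delivers SIGUSR1 to the calling thread and returns whatever pc the
// handler managed to decode from the context it was given.
static void* pc_at_signal() {
  struct sigaction sa = {};
  sa.sa_sigaction = handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  int rc = sigaction(SIGUSR1, &sa, nullptr);
  assert(rc == 0);
  (void)rc;
  last_pc = nullptr;
  raise(SIGUSR1);  // handler runs before raise() returns
  return last_pc;
}
```

With decode_enabled set to false, pc_at_signal() returns NULL on every platform, which mirrors the generic behavior that the unimplemented arches keep.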

Sample test:

import java.lang.reflect.*;
import sun.misc.Unsafe;

public class Crash {
  public static void main(String... args) throws Exception {
    Field f = Unsafe.class.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    Unsafe u = (Unsafe) f.get(null);
    u.getInt(42); // accessing via broken ptr
  }
}

Linux x86_64 Zero fastdebug crash currently:

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=538793, tid=538794
#
...
# (no native frame info)
...
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000000000000002a

Linux x86_64 Zero fastdebug crash with this patch:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fbbbf08b584, pid=520119, tid=520120
#
...
# Problematic frame:
# V  [libjvm.so+0xcbe584]  Unsafe_GetInt+0xe4
...
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000000000000002a

Linux x86_64 Zero fastdebug crash with this patch and -XX:-DecodeErrorContext:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=520268, tid=520269
#
...
# Problematic frame:
# C  0x0000000000000000
...
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000000000000002a

Additional testing:

  • [x] Linux x86_64 Zero fastdebug eyeballing crash logs
  • [x] Linux x86_64 Zero fastdebug, tier1
  • [x] Linux {x86_64, x86_32, aarch64, arm, riscv64, s390x, ppc64le, ppc64be} Zero fastdebug builds

Progress

  • [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8294211: Zero: Decode arch-specific error context if possible

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10397/head:pull/10397
$ git checkout pull/10397

Update a local copy of the PR:
$ git checkout pull/10397
$ git pull https://git.openjdk.org/jdk pull/10397/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10397

View PR using the GUI difftool:
$ git pr show -t 10397

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10397.diff

shipilev, Sep 22 '22 18:09

:wave: Welcome back shade! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot], Sep 22 '22 18:09

@shipilev The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot], Sep 22 '22 18:09

> Good! But why make this conditional with a switch? Who would not want to have better error information?

Because I want to be able to test the generic error handling paths that would run on "generic" arch, without leaving the comfort of my x86_64 machine :)

shipilev, Sep 22 '22 18:09

>> Good! But why make this conditional with a switch? Who would not want to have better error information?

> Because I want to be able to test the generic error handling paths that would run on "generic" arch, without leaving the comfort of my x86_64 machine :)

:-) Okay. Like zero-in-zero.

tstuefe, Sep 22 '22 19:09

@shipilev This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8294211: Zero: Decode arch-specific error context if possible

Reviewed-by: stuefe, luhenry

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot], Sep 22 '22 19:09

I think this still works. Any other reviews, please?

shipilev, Sep 29 '22 16:09

> I think this still works. Any other reviews, please?

Ping. :)

shipilev, Oct 13 '22 19:10

/integrate

shipilev, Oct 19 '22 08:10

Going to push as commit 3f3d63d02ada66d5739e690d786684d25dc59004. Since your change was applied there have been 48 commits pushed to the master branch:

  • f502ab85c987be827d36b0a29f77ec5ce5bb3d01: 8295435: Build failure with GCC7 after JDK-8294314 due to strict-overflow warnings
  • 3f4964f83d6f03efbee2fb34aa8258d4fc923efb: 8293291: Simplify relocation of native pointers in archive heap
  • 1553551d821d92e529116e6ce56846831b13f492: 8286918: Better HttpServer service
  • 400aa2fb2c00c783f08b8e8dfc0ef9e63cbc4607: 8286511: Improve macro allocation
  • 2cee77444feb7911dc2234cbde0dccee4e6279c9: 8289366: Improve HTTP/2 client usage
  • 1ae683652134782745c4a7f261af3cbfc241e683: 8288508: Enhance ECDSA usage
  • 40539de8da78294a6d0ff0236687817cd767754b: 8286910: Improve JNDI lookups
  • 896a29dfaef6f0fb8e90b85205b599923d6e9e53: 8287446: Enhance icon presentations
  • 5a8e5ea3e234dc50935c09519791a59ee84f08c0: 8286526: Improve NTLM support
  • c622d56a0da5c27490bbe8ec572865b934499833: 8286519: Better memory handling
  • ... and 38 more: https://git.openjdk.org/jdk/compare/172006c0e9433046252bd79e8864890ab7c0ce56...master

Your commit was automatically rebased without conflicts.

openjdk[bot], Oct 19 '22 08:10

@shipilev Pushed as commit 3f3d63d02ada66d5739e690d786684d25dc59004.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot], Oct 19 '22 08:10