
8269881: SA stack dump fails to include stack trace for SteadyStateThread

Open plummercj opened this issue 1 year ago • 6 comments

The completely unrelated fix for JDK-8335124 led me to believe that the occasional failure to get the stack trace of the SteadyStateThread might be because the thread is still active for a short period after it is reported as being in the Thread.State.BLOCKED state. Once that state is set, the thread still needs to call a native OS API to actually block, and only then is it truly idle. During this window the thread's stack might be inconsistent and not walkable. The fix is to add a short sleep after the thread has moved to the Thread.State.BLOCKED state, giving it a chance to finish blocking.
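
For illustration only, here is a minimal sketch of the "poll for BLOCKED, then settle" idea described above. It is not the actual patch: the class name, method names, poll interval, and settle delay are all hypothetical, and the real test code may be structured differently.

```java
// Hypothetical sketch; names and timings are assumptions, not the JDK test code.
class BlockedSettleSketch {

    /** Starts a daemon thread that blocks on {@code lock}; the caller must already hold it. */
    static Thread startBlockedThread(Object lock) {
        Thread t = new Thread(() -> {
            synchronized (lock) {
                // Entered only after the caller releases the lock.
            }
        }, "SteadyStateThreadSketch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /**
     * Polls until the thread reports Thread.State.BLOCKED, then sleeps for
     * settleMillis so the thread has a chance to finish blocking in the OS.
     */
    static void waitUntilBlockedAndSettled(Thread t, long settleMillis)
            throws InterruptedException {
        while (t.getState() != Thread.State.BLOCKED) {
            if (t.getState() == Thread.State.TERMINATED) {
                throw new IllegalStateException("thread exited before blocking");
            }
            Thread.sleep(10); // assumed poll interval
        }
        Thread.sleep(settleMillis); // the extra settle delay
    }

    public static void main(String[] args) throws InterruptedException {
        Object lock = new Object();
        synchronized (lock) {
            Thread t = startBlockedThread(lock);
            waitUntilBlockedAndSettled(t, 20); // assumed settle delay
            System.out.println(t.getName() + " state: " + t.getState());
        }
    }
}
```

The settle delay only narrows the window between the state change and the thread actually parking; it does not guarantee that the stack is walkable at any particular instant.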

Tested with Tier1 CI and all svc test tasks for tier2 and tier5.


Progress

  • [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8269881: SA stack dump fails to include stack trace for SteadyStateThread (Bug - P4)

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19951/head:pull/19951
$ git checkout pull/19951

Update a local copy of the PR:
$ git checkout pull/19951
$ git pull https://git.openjdk.org/jdk.git pull/19951/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19951

View PR using the GUI difftool:
$ git pr show -t 19951

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19951.diff

plummercj avatar Jun 28 '24 20:06 plummercj

/label serviceability

plummercj avatar Jun 28 '24 20:06 plummercj

:wave: Welcome back cjplummer! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Jun 28 '24 20:06 bridgekeeper[bot]

@plummercj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8269881: SA stack dump fails to include stack trace for SteadyStateThread

Reviewed-by: kevinw, sspitsyn, lmesnik

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 239 new commits pushed to the master branch:

  • 21a6cf848da00c795d833f926f831c7aea05dfa3: 8336587: failure_handler lldb command times out on macosx-aarch64 core file
  • 78cc0f9569535c72900cf4617e22cef99f695e61: 8336091: Fix HTML warnings in the generated HTML files
  • bcb5e69505f6cc8a4f323924cd2c58e630595fc0: 8335921: Fix HotSpot VM build without JVMTI
  • 10186ff48fe67aeb83c028b47f6b7e5105513cf3: 8336300: DateFormatSymbols#getInstanceRef returns non-cached instance
  • 7ec55df34af98e9a80381dba7f7f2127f2248f73: 8336638: Parallel: Remove redundant mangle in PSScavenge::invoke
  • 6df7acbc74922d297852044596045a8b32636423: 8299080: Wrong default value of snippet lang attribute
  • 871362870ea8dc5f4ac186876e91023116891a5b: 8334217: [AIX] Misleading error messages after JDK-8320005
  • 67979eb0771ff834d6d3d18ba5a8bfe161cfc2ce: 8334781: JFR crash: assert(((((JfrTraceIdBits::load(klass)) & ((JfrTraceIdEpoch::this_epoch_method_and_class_bits()))) != 0))) failed: invariant
  • d41d2a7a82cb6eff17396717e2e14139ad8179ba: 8334502: gtest/GTestWrapper.java fails on armhf due to LogDecorations.iso8601_utctime_test
  • 59843f4a65c18b9a9cc32d4146e569b0e8c89baf: 8336040: Missing closing anchor element in Docs.gmk
  • ... and 229 more: https://git.openjdk.org/jdk/compare/b5d589623c174757e946011495f771718318f1cc...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Jun 28 '24 20:06 openjdk[bot]

@plummercj The serviceability label was successfully added.

openjdk[bot] avatar Jun 28 '24 20:06 openjdk[bot]

Webrevs

mlbridge[bot] avatar Jun 28 '24 20:06 mlbridge[bot]

@kevinjwalls Actually, in all cases, after launching LingeredApp and waiting for the SteadyStateThread to be "ready", there is still the launching of the clhsdb tool, which is going to take some time. It seems hard to believe that the SteadyStateThread would ever lose out on that race.

I get the feeling that maybe there is more going on here than I initially thought. Almost all of these failures are on Windows (about 22 out of 25), with the other 3 on linux-arm. Maybe there is sometimes some sort of OS hiccup that delays the SteadyStateThread. In any case, there is no real harm in this fix, and hopefully it helps.

plummercj avatar Jul 01 '24 18:07 plummercj

@plummercj This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Aug 15 '24 06:08 bridgekeeper[bot]

@plummercj This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

bridgekeeper[bot] avatar Sep 12 '24 07:09 bridgekeeper[bot]