jdk icon indicating copy to clipboard operation
jdk copied to clipboard

8360707: Globally enumerate all blobs, stubs and entries

Open adinn opened this issue 4 months ago • 4 comments
trafficstars

Use the blob, stub and entry declarations to generate a single global enumeration for all blobs, likewise for all stubs and all entries. Modify stub generators in shared runtime, c1 runtime, c2 runtime and stub generator subsystems and their clients to use those enumerations consistently.


Progress

  • [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8360707: Globally enumerate all blobs, stubs and entries (Sub-task - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26004/head:pull/26004
$ git checkout pull/26004

Update a local copy of the PR:
$ git checkout pull/26004
$ git pull https://git.openjdk.org/jdk.git pull/26004/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 26004

View PR using the GUI difftool:
$ git pr show -t 26004

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26004.diff

Using Webrev

Link to Webrev Comment

adinn avatar Jun 26 '25 15:06 adinn

:wave: Welcome back adinn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Jun 26 '25 15:06 bridgekeeper[bot]

@adinn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8360707: Globally enumerate all blobs, stubs and entries

Reviewed-by: kvn, fyang, asmehra

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 35 new commits pushed to the master branch:

  • 03526e250dfb9ac61f50f482b5dfb330e7fec1bf: 8355960: JvmtiAgentList::Iterator dtor double free with -fno-elide-constructors
  • 1de2acea77da57fd44b214332a73cc6621806e4d: 8361529: GenShen: Fix bad assert in swap card tables
  • fa32bfe11300fdadb35f083037f6ab2a8985d210: 8358529: GenShen: Heuristics do not respond to changes in SoftMaxHeapSize
  • ... and 32 more: https://git.openjdk.org/jdk/compare/f3e0588d0b825a68a4ad61ddf806877f46da69dc...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Jun 26 '25 16:06 openjdk[bot]

@adinn The following labels will be automatically applied to this pull request:

  • hotspot
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Jun 26 '25 16:06 openjdk[bot]

@adinn I got one test failure on s390: test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java

java.lang.RuntimeException: 1 < 2
	at jdk.test.lib.Asserts.fail(Asserts.java:715)
	at MachCodeFramesInErrorFile.run(MachCodeFramesInErrorFile.java:170)
	at MachCodeFramesInErrorFile.main(MachCodeFramesInErrorFile.java:108)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:565)
	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335)
	at java.base/java.lang.Thread.run(Thread.java:1474)

I didn't hs_err even in full verbose. But attaching overall run in txt file: 26004_test_failure.txt

offamitkumar avatar Jun 30 '25 06:06 offamitkumar

@offamitkumar Thanks for testing. I'm looking into the issue and will get back to you.

@TheRealMDoerr @offamitkumar @RealFYang Would you be able to check this on ppc/riscv to ensure it builds (cross-compile is ok) and passes (at least) tier1?

Thanks for whatever help you can provide.

adinn avatar Jun 30 '25 10:06 adinn

@offamitkumar Thanks for testing. I'm looking into the issue and will get back to you.

@TheRealMDoerr @offamitkumar @RealFYang Would you be able to check this on ppc/riscv to ensure it builds (cross-compile is ok) and passes (at least) tier1?

Thanks for whatever help you can provide.

hotspot tier1 has passed on PPC64. Thanks for the ping!

TheRealMDoerr avatar Jun 30 '25 14:06 TheRealMDoerr

@vnkozlov My fastdebug build has passed the following test suites on Linux/aarch64 and Linux/x86:

tier1 jtreg/test/hotspot/jtreg/runtime jtreg/test/hotspot/jtreg/compiler

Could you review the PR and maybe run it through internal testing?

adinn avatar Jun 30 '25 15:06 adinn

@TheRealMDoerr Thanks for testing!

adinn avatar Jun 30 '25 15:06 adinn

@offamitkumar I'm somewhat puzzled by your test result. It indicates that the hserr file generated during the test contained a stack trace for the crash point listing two compiled Java method frames but that only one [MachCode][/MachCode] listing was generated into the file. That seems a bit odd since the other arches seem to generate at least 2 listings.

I am seeing if I can reproduce the error on a borrowed s390 machine.

One thing you might perhaps be able to try meanwhile to check what is going on is to run the test again with env var DEBUG set

$ DEBUG=debug make test TEST=test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java

This ought to force the contents of the hserr file to be written to the System.err which means they will be visible in file build/linux-s390x-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_runtime_ErrorHandling_MachCodeFramesInErrorFile_java/runtime/ErrorHandling/MachCodeFramesInErrorFile.jtr

The output may get truncated but even then it still might give us some idea of what is going wrong.

adinn avatar Jun 30 '25 15:06 adinn

@offamitkumar I managed to run the test on an s390x build and it passed. I didn't get any hserr output in the jtr file when I set DEBUG on the command line. So, I modified the test to write the hserr file to system.err unconditionially (i.e. I changed the if at line 137 of MachCodeFramesInErrorFile.java to 'if (true)'.

The System.err output in the jtr file contained [MachCode] sections for 4 compiled methods, including the two that appeared in the stack listing (crashInJava3 and crashInJava2). So, I'm not sure why it is failing.

n.b. not all the System.err output ends up in the jtr file because the test harness truncates excessive output. However, I wonder if the problem you are seeing is because the original hserr file is being truncated. It includes a message to that effect starting "Output overflow: ..."

If that is the case and if you can reproduce the problem after modifying the test to write the hserr contents unconditionally then you ought to be able to see all the file contents by setting system property javatest.maxOutputSize to a suitable value (default is 100000).

adinn avatar Jun 30 '25 16:06 adinn

I submitted testing.

vnkozlov avatar Jun 30 '25 18:06 vnkozlov

The System.err output in the jtr file contained [MachCode] sections for 4 compiled methods, including the two that appeared in the stack listing (crashInJava3 and crashInJava2). So, I'm not sure why it is failing.

n.b. not all the System.err output ends up in the jtr file because the test harness truncates excessive output. However, I wonder if the problem you are seeing is because the original hserr file is being truncated. It includes a message to that effect starting "Output overflow: ..."

@adinn is this something which is expected:

Stack: [0x000003ff82980000,0x000003ff82a80000],  sp=0x000003ff82a7e4c8,  free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 21 c2 MachCodeFramesInErrorFile$Crasher.crashInJava3(J)V (27 bytes) @ 0x000003ff70657c8a [0x000003ff70657c40+0x000000000000004a]
J 20 c2 MachCodeFramesInErrorFile$Crasher.crashInJava2(J)V (13 bytes) @ 0x000003ff706580c0 [0x000003ff70658040+0x0000000000000080]
j  MachCodeFramesInErrorFile$Crasher.crashInJava1(J)V+9
j  MachCodeFramesInErrorFile$Crasher.main([Ljava/lang/String;)V+71
v  ~StubRoutines::Stub Generator call_stub_stub 0x000003ff70580edc

[error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/amit/jdk/src/hotspot/cpu/s390/frame_s390.inline.hpp:40)]

Retrying call stack printing without source information...

I don't see any truncation message in the hs_err output.

offamitkumar avatar Jul 01 '25 06:07 offamitkumar

@offamitkumar Thanks for following up.

is this something which is expected:

Stack: [0x000003ff82980000,0x000003ff82a80000],  sp=0x000003ff82a7e4c8,  free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 21 c2 MachCodeFramesInErrorFile$Crasher.crashInJava3(J)V (27 bytes) @ 0x000003ff70657c8a [0x000003ff70657c40+0x000000000000004a]
J 20 c2 MachCodeFramesInErrorFile$Crasher.crashInJava2(J)V (13 bytes) @ 0x000003ff706580c0 [0x000003ff70658040+0x0000000000000080]
j  MachCodeFramesInErrorFile$Crasher.crashInJava1(J)V+9
j  MachCodeFramesInErrorFile$Crasher.main([Ljava/lang/String;)V+71
v  ~StubRoutines::Stub Generator call_stub_stub 0x000003ff70580edc

[error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/home/amit/jdk/src/hotspot/cpu/s390/frame_s390.inline.hpp:40)]

Retrying call stack printing without source information...

That does not look right. Was there any more hserr printout after that last line? -- in particular any [MachCode]...[/MachCode] sections? It would help if you could attach the full .jtr file.

The stack backtrace in the listing above is broken at the frame for a generated stub (call_stub_stub) so it looks like the stack walk is failing. It ought to continue with a C++ frame for JavaCalls::call_helper and so on right up to the frame for start_thread. I'll check to see if perhaps this might be to do with some change to the stub.

adinn avatar Jul 01 '25 07:07 adinn

That does not look right. Was there any more hserr printout after that last line? -- in particular any [MachCode]...[/MachCode] sections? It would help if you could attach the full .jtr file.

There isn't much after this @adinn. It's the hs_err file generated during the test case execution. If you still want to take a look at jtr then I can attach it.

offamitkumar avatar Jul 01 '25 08:07 offamitkumar

@offamitkumar

There isn't much after this @adinn. It's the hs_err file generated during the test case execution. If you still want to take a look at jtr then I can attach it.

Yes please!

adinn avatar Jul 01 '25 08:07 adinn

@RealFYang Thanks for the test and review!

adinn avatar Jul 01 '25 10:07 adinn

@offamitkumar I'm stumped as to why this could be causing any problem on s390x. The error you are seeing appears to be in happening under frame::next_frame during lookup of the parent frame for the call stub frame. That occurs when printing the 'Native frames' part of the hs_err file content.

If I run my build on an s390 box I don't see that error. The frames are printed to the file as expected. Neither can I see any way in which the changes I have made could influence the stack walk code. They do not change the frame size and layout.

@TheRealMDoerr Would you be able to run hotspot jtreg test runtime/ErrorHandling/MachCodeFramesInErrorFile.java on an s390 box and see if you can reproduce the error that @offamitkumar is seeing?

adinn avatar Jul 01 '25 13:07 adinn

@offamitkumar: Is the crash you reported reproducible? Could it be that save_return_pc() is missing in address generate_call_stub(address& return_address)? PPC64 has it.

TheRealMDoerr avatar Jul 01 '25 14:07 TheRealMDoerr

@TheRealMDoerr Hmm, that might possibly be it. It could explain why I am not seeing the error and @offamitkumar is (the failure will depend on what happens to be left lying in the saved PC register). What do you think, Amit?

adinn avatar Jul 01 '25 14:07 adinn

Hi.

Not a review. Here lists our internal test result.

1. VM crashed for arm32

I created one armhf build via cross-compilation and ran java --version via Qemu. Here lists the error log.

$ sudo chroot /sysroot/armhf /tmp/build-armhf/images/jdk/bin/java --version
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/tmp/jdk-src/src/hotspot/share/runtime/stubInfo.cpp:416), pid=117661, tid=117663
#  assert(declaredStub == stub_cursor) failed: Stubgen entry Arm_atomic_store_long_entry declares stub in scope of wrong stub Stub Generator atomic_store_long_stub
#
# JRE version:  (26.0) (fastdebug build )
# Java VM: OpenJDK Server VM (fastdebug 26-internal-git-34ec2f11671, mixed mode, sharing, serial gc, linux-arm)
# Problematic frame:
# V  [libjvm.so+0xdf75f4]  StubInfo::process_stubgen_entry(StubGroup&, BlobId&, StubId&, EntryId&, char const*, BlobId, StubId, EntryId, int) [clone .constprop.0]+0x137
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to //core.117661)
#
# An error report file with more information is saved as:
# //hs_err_pid117661.log
#
#
qemu: uncaught target signal 6 (Aborted) - core dumped
environment: line 1: 117660 Aborted                 "$@"

It seems that the expected stub Stub Generator is atomic_load_long_stub. I guess the related code is around https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/arm/stubDeclarations_arm.hpp#L45~L49

2. copyright year should be updated for the following files

src/hotspot/cpu/arm/stubRoutinesCrypto_arm.cpp
src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_adler.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_cos.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_exp.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_ghash.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_log.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_poly1305.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_pow.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_sin.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_tan.cpp
src/hotspot/cpu/x86/stubGenerator_x86_64_tanh.cpp
src/hotspot/share/c1/c1_CodeStubs.hpp
src/hotspot/share/runtime/stubCodeGenerator.hpp

shqking avatar Jul 02 '25 09:07 shqking

@vnkozlov

I assume you will address s390 issue before integration.

Amit and Martin followed up on this offline. The problem apppears to be an independent, unrelated issue to do with save and restore of SP at the correct point and they are following it up separately. It only happens intermittently because it depends on what gets left in memory.

adinn avatar Jul 02 '25 09:07 adinn

@shqking Thank you for testing and reporting the issue and also for noticing the missing copyright updates.

The assertion is happening because the arm-specific stub declarations are inconsistent. The entry declaration for atomic_store_long_entry claims it belongs to stub atomic_load_long but the current stub is actually atomic_store_long.

I will push a patch to fix that and another one for the copyrights.

adinn avatar Jul 02 '25 10:07 adinn

@shqking I hope the arm build will now work ok.

adinn avatar Jul 02 '25 10:07 adinn

@vnkozlov Could you please rereview this?

adinn avatar Jul 02 '25 12:07 adinn

@shqking I hope the arm build will now work ok.

Thanks for your fix. Yes. It works now.

shqking avatar Jul 03 '25 00:07 shqking

@ashu-mehra Could you please review this PR. In particular, can you check the modifications I just made to the aotCodeCache blob save/load API. We need to save the blob using a blob id rather than a stub id.

n.b. I had to re-enable the AOTStubCaching flag to test these latest changes (currently, it is always disabled in aotCodeCache.cpp). None of the blob save/restore functionality is exercised without that override.

adinn avatar Jul 03 '25 14:07 adinn

@adinn yes, I have been reviewing it. Slowly making my way through the code :)

ashu-mehra avatar Jul 03 '25 15:07 ashu-mehra

Looks good to me. I have just a minor comment. Thanks for these changes. This should help in simplifying the AOTCodeAddressTable.

ashu-mehra avatar Jul 03 '25 19:07 ashu-mehra

@vnkozlov Could you please rereview this?

This looks fine. Please merge latest jdk so I can do additional testing with AOTStubCaching enabled.

vnkozlov avatar Jul 03 '25 19:07 vnkozlov