jdk
jdk copied to clipboard
8331953: ubsan: metaspaceShared.cpp:1305:57: runtime error: applying non-zero offset 12849152 to null pointer
Reported by @MBaesken at SAP. ubsan complains about this line:
const size_t ccs_begin_offset = align_up(base_address + archive_space_size,
class_space_alignment) - base_address;
base_address here is the wish address, with NULL being an explicitly allowed value that indicates "no preference". The line calculates the offset of the class space within the future combined CDS+class-space mapping. Ubsan complains about base_address being possibly NULL.
Ubsan is missing the point somewhat. The addition is not a problem, on our platforms at least.
However, it highlights a slight incorrectness (which it did not notice): base_address is the wish for the future base of the Klass range. That wish is not guaranteed to be fulfilled; the eventual start of the Klass range could be somewhere else. Therefore, calculating the class space offset with an alignment based on that wish address is wrong. It always worked in practice since base_address was always aligned to class_space_alignment (16MB).
Hence, the fix is simple: We just make the alignment requirement for the base address explicit. When running with class space, we now assert that base_address is aligned to class space alignment (as well as CDS core region alignment, but that is much smaller). Since base_address is calculated either from a hard-wired default or from the SharedBaseAddress user input, and both are ensured to be properly aligned, that assert should never fire.
Then, the offending calculation can be simplified by removing the base address from it.
Progress
- [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Warning
⚠️ Found leading lowercase letter in issue title for 8331953: ubsan: metaspaceShared.cpp:1305:57: runtime error: applying non-zero offset 12849152 to null pointer
Issue
- JDK-8331953: ubsan: metaspaceShared.cpp:1305:57: runtime error: applying non-zero offset 12849152 to null pointer (Bug - P4)
Reviewers
- Ioi Lam (@iklam - Reviewer)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19152/head:pull/19152
$ git checkout pull/19152
Update a local copy of the PR:
$ git checkout pull/19152
$ git pull https://git.openjdk.org/jdk.git pull/19152/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 19152
View PR using the GUI difftool:
$ git pr show -t 19152
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19152.diff
Webrev
:wave: Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@tstuefe This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
8331953: ubsan: metaspaceShared.cpp:1305:57: runtime error: applying non-zero offset 12849152 to null pointer
Reviewed-by: iklam, mbaesken
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 100 new commits pushed to the master branch:
- f9f8d0b48057a02923e36c8e11286b57cc72279e: 8332101: Add an
@sincetoStandardOperation:REMOVEinjdk.dynalink - f398cd225012694a586e528936159b6df7b1586c: 8331575: C2: crash when ConvL2I is split thru phi at LongCountedLoop
- 96c5c3fe75103dc45bc1c3ccce0ab36303121a60: 8329998: Remove double initialization for parts of small TypeArrays in ZObjArrayAllocator
- ee4a9d34827166ff9ac04e2375058fdc08e43194: 8321622: ClassFile.verify(byte[] bytes) throws unexpected ConstantPoolException, IAE
- ab8d7b0cedfaae124262325cd1d4b59cef996d85: 8324517: C2: crash in compiled code because of dependency on removed range check CastIIs
- fe8a2aff3129b515c2a0f3ab96f5e3ad6cef7b70: 8307778: com/sun/jdi/cds tests fail with jtreg's Virtual test thread factory
- 95f79c678737fb8de9ed45c516761d4d818869ef: 8332253: Linux arm32 build fails after 8292591
- b687aa550837830b38f0f0faa69c353b1e85219c: 8332176: Refactor ClassListParser::parse()
- 4083255440cfbf39b9683ea88a433d71ec6111e7: 8316138: Add GlobalSign 2 TLS root certificates
- 43b109b111e77d0f7b302debc0d76e4ac7c9ac56: 8330066: HeapDumpPath and HeapDumpGzipLevel VM options do not mention HeapDumpBeforeFullGC and HeapDumpAfterFullGC
- ... and 90 more: https://git.openjdk.org/jdk/compare/2baacfc16916220846743c6e49a99a6c41cac510...master
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.
@tstuefe The following label will be automatically applied to this pull request:
hotspot-runtime
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
@MBaesken , and possibly @iklam or @calvinccheung ?
I added the PR to our build+test queue .
We see now errors like this on Linux ppc64le with the patch added.
# Internal Error (/jdk/src/hotspot/share/cds/metaspaceShared.cpp:1295), pid=7002, tid=7006
# assert(is_aligned(base_address, base_address_alignment)) failed: Archive base address unaligned: 0x0000000000010000, needs alignment: 16777216.
#
# JRE version: (23.0) (fastdebug build )
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-ppc64le)
# Problematic frame:
# V [libjvm.so+0x1546818] MetaspaceShared::reserve_address_space_for_archives(FileMapInfo*, FileMapInfo*, bool, ReservedSpace&, ReservedSpace&, ReservedSpace&)+0x258
Current thread (0x00007fff8c02ca00): JavaThread "Unknown thread" [_thread_in_vm, id=7006, stack(0x00007fff92350000,0x00007fff92550000) (2048K)]
Stack: [0x00007fff92350000,0x00007fff92550000], sp=0x00007fff9254dd60, free space=2039k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x1546818] MetaspaceShared::reserve_address_space_for_archives(FileMapInfo*, FileMapInfo*, bool, ReservedSpace&, ReservedSpace&, ReservedSpace&)+0x258 (metaspaceShared.cpp:1295)
V [libjvm.so+0x154937c] MetaspaceShared::map_archives(FileMapInfo*, FileMapInfo*, bool)+0xcc (metaspaceShared.cpp:1050)
V [libjvm.so+0x1549d38] MetaspaceShared::initialize_runtime_shared_and_meta_spaces()+0x2c8 (metaspaceShared.cpp:931)
V [libjvm.so+0x153bd68] Metaspace::global_initialize()+0x178 (metaspace.cpp:714)
V [libjvm.so+0x1c203d8] universe_init()+0x168 (universe.cpp:866)
V [libjvm.so+0xf60130] init_globals()+0x90 (init.cpp:128)
V [libjvm.so+0x1bd8694] Threads::create_vm(JavaVMInitArgs*, bool*)+0x364 (threads.cpp:553)
V [libjvm.so+0x114c4b8] JNI_CreateJavaVM+0x98 (jni.cpp:3581)
C [libjli.so+0x5e94] JavaMain+0xd4 (java.c:1550)
C [libjli.so+0xbc38] ThreadJavaMain+0x18 (java_md.c:653)
C [libpthread.so.0+0xaa68] start_thread+0x108
We see now errors like this on Linux ppc64le with the patch added.
Thanks, Matthias, the new version should not have this problem.
@iklam, could you review again? The only new hunk came with this commit: https://github.com/openjdk/jdk/pull/19152/commits/a38441d0e8357c755a3f82a9e4db2e995b00a196 .
I now make sure that the SharedBaseAddress passed in during archive generation is correctly aligned to both archive core alignment and metaspace alignment. I thought we did this already, but apparently not.
We see now errors like this on Linux ppc64le with the patch added.
# Internal Error (/jdk/src/hotspot/share/cds/metaspaceShared.cpp:1295), pid=7002, tid=7006 # assert(is_aligned(base_address, base_address_alignment)) failed: Archive base address unaligned: 0x0000000000010000, needs alignment: 16777216.
This looks curious. Why would base_address be 0x0000000000010000? Is that value contained in static_mapinfo->requested_base_address()? I am wondering is there's a rounding/alignment bug that leads up to this value on ppc64le.
We see now errors like this on Linux ppc64le with the patch added.
# Internal Error (/jdk/src/hotspot/share/cds/metaspaceShared.cpp:1295), pid=7002, tid=7006 # assert(is_aligned(base_address, base_address_alignment)) failed: Archive base address unaligned: 0x0000000000010000, needs alignment: 16777216.This looks curious. Why would
base_addressbe0x0000000000010000? Is that value contained instatic_mapinfo->requested_base_address()? I am wondering is there's a rounding/alignment bug that leads up to this value on ppc64le.
No, that's the test. It dumps archives with different SharedBaseAddress, then checks that the archives can be loaded. One of the SharedBaseAddress values is 64K:
https://github.com/openjdk/jdk/blob/95f79c678737fb8de9ed45c516761d4d818869ef/test/hotspot/jtreg/runtime/cds/SharedBaseAddress.java#L76
It used to work before this patch, though subtly broken, because we did not require the archive base to be aligned to class space alignment. Instead, we aligned the class space within the mapping (so, gap between archive and class space can be higher). That, however, has the problem described in the description. We calculate the class space offset in the mapping before actually mapping, based on the SharedAddressBase given at archive creation. If the mapping fails (likely with a value aligned to 64K) and the real mapping later is aligned at a different alignment (likely with 64K), either class space start will be misaligned or it gets re-aligned but then the mapping size may be too small.
SAP reported that all tests are green now.
@MBaesken could you give this a quick thumbs up?
Thanks @MBaesken and @iklam !
/integrate
Going to push as commit 910d77becd15580296687b00fed085ab106cb2eb.
Since your change was applied there have been 100 commits pushed to the master branch:
- f9f8d0b48057a02923e36c8e11286b57cc72279e: 8332101: Add an
@sincetoStandardOperation:REMOVEinjdk.dynalink - f398cd225012694a586e528936159b6df7b1586c: 8331575: C2: crash when ConvL2I is split thru phi at LongCountedLoop
- 96c5c3fe75103dc45bc1c3ccce0ab36303121a60: 8329998: Remove double initialization for parts of small TypeArrays in ZObjArrayAllocator
- ee4a9d34827166ff9ac04e2375058fdc08e43194: 8321622: ClassFile.verify(byte[] bytes) throws unexpected ConstantPoolException, IAE
- ab8d7b0cedfaae124262325cd1d4b59cef996d85: 8324517: C2: crash in compiled code because of dependency on removed range check CastIIs
- fe8a2aff3129b515c2a0f3ab96f5e3ad6cef7b70: 8307778: com/sun/jdi/cds tests fail with jtreg's Virtual test thread factory
- 95f79c678737fb8de9ed45c516761d4d818869ef: 8332253: Linux arm32 build fails after 8292591
- b687aa550837830b38f0f0faa69c353b1e85219c: 8332176: Refactor ClassListParser::parse()
- 4083255440cfbf39b9683ea88a433d71ec6111e7: 8316138: Add GlobalSign 2 TLS root certificates
- 43b109b111e77d0f7b302debc0d76e4ac7c9ac56: 8330066: HeapDumpPath and HeapDumpGzipLevel VM options do not mention HeapDumpBeforeFullGC and HeapDumpAfterFullGC
- ... and 90 more: https://git.openjdk.org/jdk/compare/2baacfc16916220846743c6e49a99a6c41cac510...master
Your commit was automatically rebased without conflicts.
@tstuefe Pushed as commit 910d77becd15580296687b00fed085ab106cb2eb.
:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.