
JDK17 openj9 (semeru) crash when running on alpine 3.20.2

Open yaakov-berkovitch opened this issue 1 year ago • 11 comments

Java -version output

openjdk 17.0.12 2024-07-16 IBM Semeru Runtime Open Edition 17.0.12.0 (build 17.0.12+7) Eclipse OpenJ9 VM 17.0.12.0 (build openj9-0.46.0, JRE 17 Linux amd64-64-Bit Compressed References 20240716_818 (JIT enabled, AOT enabled) OpenJ9 - 1a6f6128aa OMR - 840a9adba JCL - 784bd66222d based on jdk-17.0.12+7)

Summary of problem

A segmentation error occurs when running on Alpine 3.20.2. It works fine when running on Alpine v3.17.2. Attachment: javacore.20240825.071057.8.0006.txt.zip

485fc6bfcb43:~$ cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.20.2
PRETTY_NAME="Alpine Linux v3.20"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"

Diagnostic files

If a crash (gpf, assert, abort, etc) or OutOfMemory condition, please provide the diagnostic files produced (javacore, Snap, jitdump, core). The smaller files can be attached to this Issue. The core should be compressed and made available via a file sharing service (Box, Google Drive, etc). If there are privacy concerns please direct email the files to an OpenJ9 committer.

The stderr console output is as follows.

xmm1=56414a5f394a4e45 (f: 961171008.000000, d: 3.172462e+107)
xmm2=6c00726964646165 (f: 1684300160.000000, d: 1.730261e+212)
xmm3=0000726964646165 (f: 1684300160.000000, d: 6.215197e-310)
xmm4=78756e696c2d646c (f: 1814914176.000000, d: 1.811526e+272)
xmm5=0000003000000020 (f: 32.000000, d: 1.018558e-312)
xmm6=000000000007daf8 (f: 514808.000000, d: 2.543489e-318)
xmm7=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm8=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm9=00ffffffff0000ff (f: 4278190336.000000, d: 7.291122e-304)
xmm10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11=0000015200000151 (f: 337.000000, d: 7.172346e-312)
xmm12=0000013d00000140 (f: 320.000000, d: 6.726727e-312)
xmm13=000001380000013f (f: 319.000000, d: 6.620627e-312)
xmm14=0000000008001800 (f: 134223872.000000, d: 6.631540e-316)
xmm15=000001420000013b (f: 315.000000, d: 6.832826e-312)
Target=2_90_20240716_818 (Linux 5.10.219-208.866.amzn2.x86_64)
CPU=amd64 (16 logical CPUs) (0xf8097f000 RAM)
----------- Stack Backtrace -----------
 (0x000000000004C0D0 [<unknown>+0x0])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2024/08/25 15:55:09 - please wait.
JVMDUMP032I JVM requested System dump using '/home/afa/core.20240825.155509.8.0001.dmp' in response to an event
JVMDUMP010I System dump written to /home/afa/core.20240825.155509.8.0001.dmp
JVMDUMP032I JVM requested Java dump using '/home/afa/javacore.20240825.155509.8.0002.txt' in response to an event
JVMDUMP010I Java dump written to /home/afa/javacore.20240825.155509.8.0002.txt
JVMDUMP032I JVM requested Snap dump using '/home/afa/Snap.20240825.155509.8.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /home/afa/Snap.20240825.155509.8.0003.trc
JVMDUMP032I JVM requested JIT dump using '/home/afa/jitdump.20240825.155509.8.0004.dmp' in response to an event
JVMDUMP051I JIT dump occurred in 'main' thread 0x0000000000017000
JVMDUMP010I JIT dump written to /home/afa/jitdump.20240825.155509.8.0004.dmp
JVMDUMP013I Processed dump event "gpf", detail "".
JVMDUMP039I Processing dump event "abort", detail "" at 2024/08/25 15:55:09 - please wait.
JVMDUMP032I JVM requested System dump using '/home/afa/core.20240825.155509.8.0005.dmp' in response to an event
JVMDUMP010I System dump written to /home/afa/core.20240825.155509.8.0005.dmp
JVMDUMP032I JVM requested Java dump using '/home/afa/javacore.20240825.155509.8.0006.txt' in response to an event
JVMDUMP010I Java dump written to /home/afa/javacore.20240825.155509.8.0006.txt
JVMDUMP032I JVM requested Snap dump using '/home/afa/Snap.20240825.155509.8.0007.trc' in response to an event
JVMDUMP010I Snap dump written to /home/afa/Snap.20240825.155509.8.0007.trc
JVMDUMP032I JVM requested JIT dump using '/home/afa/jitdump.20240825.155509.8.0008.dmp' in response to an event
JVMDUMP051I JIT dump occurred in 'main' thread 0x0000000000017000
JVMDUMP010I JIT dump written to /home/afa/jitdump.20240825.155509.8.0008.dmp
JVMDUMP013I Processed dump event "abort", detail "".

yaakov-berkovitch avatar Aug 25 '24 16:08 yaakov-berkovitch

1XMCURTHDINFO  Current thread
3XMTHREADINFO      "main" J9VMThread:0x0000000000017000, omrthread_t:0x00007F0E0C01C090, java/lang/Thread:0x000000070004B258, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x1, isDaemon:false)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x0000000700043CD0)
3XMTHREADINFO1            (native thread ID:0xC, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000020)
3XMTHREADINFO2            (native stack address range from:0x00007F0E112E5000, to:0x00007F0E11365000, size:0x80000)
3XMCPUTIME               CPU usage total: 0.972463676 secs, current category="System-JVM"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=892960 (0xDA020)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at jdk/crypto/jniprovider/NativeCrypto.loadCrypto(Native Method)
4XESTACKTRACE                at jdk/crypto/jniprovider/NativeCrypto.loadCryptoLibraries(NativeCrypto.java:87)
4XESTACKTRACE                at jdk/crypto/jniprovider/NativeCrypto.lambda$new$0(NativeCrypto.java:105)
4XESTACKTRACE                at jdk/crypto/jniprovider/NativeCrypto$$Lambda$72/0x000000000c98dd60.run(Bytecode PC:0)
4XESTACKTRACE                at java/security/AccessController.doPrivileged(AccessController.java:692)
4XESTACKTRACE                at jdk/crypto/jniprovider/NativeCrypto.<init>(NativeCrypto.java:105)
4XESTACKTRACE                at jdk/crypto/jniprovider/NativeCrypto$InstanceHolder.<clinit>(NativeCrypto.java:72)
4XESTACKTRACE                at jdk/crypto/jniprovider/NativeCrypto.getVersionIfAvailable(NativeCrypto.java:131)
4XESTACKTRACE                at jdk/crypto/jniprovider/NativeCrypto.isAllowedAndLoaded(NativeCrypto.java:116)
4XESTACKTRACE                at sun/security/provider/SunEntries.<init>(SunEntries.java:281)
4XESTACKTRACE                at sun/security/provider/Sun.<init>(Sun.java:56)
5XESTACKTRACE                   (entered lock: sun/security/jca/ProviderConfig@0x00000007FFEC2088, entry count: 1)
4XESTACKTRACE                at sun/security/jca/ProviderConfig.getProvider(ProviderConfig.java:198)
4XESTACKTRACE                at sun/security/jca/ProviderList.getProvider(ProviderList.java:293)
4XESTACKTRACE                at sun/security/jca/ProviderList$3.get(ProviderList.java:183)
4XESTACKTRACE                at sun/security/jca/ProviderList$3.get(ProviderList.java:178)
4XESTACKTRACE                at java/util/AbstractList$Itr.next(AbstractList.java:371)
4XESTACKTRACE                at java/security/SecureRandom.getDefaultPRNG(SecureRandom.java:279)
4XESTACKTRACE                at java/security/SecureRandom.<init>(SecureRandom.java:233)
5XESTACKTRACE                   (entered lock: java/lang/Object@0x00000007FFEBFD38, entry count: 1)
4XESTACKTRACE                at java/rmi/server/UID.<init>(UID.java:112)
4XESTACKTRACE                at java/rmi/server/ObjID.<clinit>(ObjID.java:88)
4XESTACKTRACE                at sun/rmi/transport/LiveRef.<init>(LiveRef.java:74)
4XESTACKTRACE                at sun/management/jmxremote/ConnectorBootstrap$PermanentExporter.exportObject(ConnectorBootstrap.java:201)
4XESTACKTRACE                at javax/management/remote/rmi/RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:153)
4XESTACKTRACE                at javax/management/remote/rmi/RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:138)
5XESTACKTRACE                   (entered lock: javax/management/remote/rmi/RMIConnectorServer@0x00000007FFE9DB48, entry count: 1)
4XESTACKTRACE                at javax/management/remote/rmi/RMIConnectorServer.start(RMIConnectorServer.java:453)
4XESTACKTRACE                at sun/management/jmxremote/ConnectorBootstrap.exportMBeanServer(ConnectorBootstrap.java:839)
5XESTACKTRACE                   (entered lock: sun/management/jmxremote/ConnectorBootstrap@0x0000000700028B40, entry count: 1)
4XESTACKTRACE                at sun/management/jmxremote/ConnectorBootstrap.startRemoteConnectorServer(ConnectorBootstrap.java:480)
4XESTACKTRACE                at jdk/internal/agent/Agent.startAgent(Agent.java:447)
4XESTACKTRACE                at jdk/internal/agent/Agent.startAgent(Agent.java:599)
4XESTACKTRACE                at java/lang/System.startSNMPAgent(Native Method)
4XESTACKTRACE                at java/lang/J9VMInternals.threadCompleteInitialization(J9VMInternals.java:75)
4XESTACKTRACE                at java/lang/J9VMInternals.completeInitialization(J9VMInternals.java:88)
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               protectedBacktrace+0x12 (0x00007F0E10CE1DF2 [libj9prt29.so+0x25df2])
4XENATIVESTACK               omrsig_protect+0x239 (0x00007F0E10CE63C9 [libj9prt29.so+0x2a3c9])
4XENATIVESTACK               omrintrospect_backtrace_thread_raw+0xbe (0x00007F0E10CE22CE [libj9prt29.so+0x262ce])
4XENATIVESTACK               omrsig_protect+0x239 (0x00007F0E10CE63C9 [libj9prt29.so+0x2a3c9])
4XENATIVESTACK               omrintrospect_backtrace_thread+0x87 (0x00007F0E10CE1C77 [libj9prt29.so+0x25c77])
4XENATIVESTACK               setup_native_thread+0x1e3 (0x00007F0E10CE2C43 [libj9prt29.so+0x26c43])
4XENATIVESTACK               omrintrospect_threads_startDo_with_signal+0x41f (0x00007F0E10CE3D9F [libj9prt29.so+0x27d9f])
4XENATIVESTACK               omrsig_protect+0x239 (0x00007F0E10CE63C9 [libj9prt29.so+0x2a3c9])
4XENATIVESTACK               _ZN18JavaCoreDumpWriter28writeThreadsWithNativeStacksEv+0x430 (0x00007F0E10AF7F50 [libj9dmp29.so+0x19f50])
4XENATIVESTACK               protectedWriteThreadsWithNativeStacks+0xd (0x00007F0E10AF87FD [libj9dmp29.so+0x1a7fd])
4XENATIVESTACK               omrsig_protect+0x239 (0x00007F0E10CE63C9 [libj9prt29.so+0x2a3c9])
4XENATIVESTACK               _ZN18JavaCoreDumpWriter18writeThreadSectionEv+0x14b (0x00007F0E10AF4BEB [libj9dmp29.so+0x16beb])
4XENATIVESTACK               protectedWriteSection+0x1d (0x00007F0E10AEFA0D [libj9dmp29.so+0x11a0d])
4XENATIVESTACK               omrsig_protect+0x239 (0x00007F0E10CE63C9 [libj9prt29.so+0x2a3c9])
4XENATIVESTACK               _ZN18JavaCoreDumpWriterC2EPKcP16J9RASdumpContextP14J9RASdumpAgent+0x3f5 (0x00007F0E10AF0EF5 [libj9dmp29.so+0x12ef5])
4XENATIVESTACK               runJavadump+0x1c (0x00007F0E10AFB13C [libj9dmp29.so+0x1d13c])
4XENATIVESTACK               doJavaDump+0x42 (0x00007F0E10AE3182 [libj9dmp29.so+0x5182])
4XENATIVESTACK               protectedDumpFunction+0x15 (0x00007F0E10AE27C5 [libj9dmp29.so+0x47c5])
4XENATIVESTACK               omrsig_protect+0x239 (0x00007F0E10CE63C9 [libj9prt29.so+0x2a3c9])
4XENATIVESTACK               runDumpFunction+0x62 (0x00007F0E10AE5FB2 [libj9dmp29.so+0x7fb2])
4XENATIVESTACK               runDumpAgent+0x15d (0x00007F0E10AE613D [libj9dmp29.so+0x813d])
4XENATIVESTACK               triggerDumpAgents+0x615 (0x00007F0E10AFD955 [libj9dmp29.so+0x1f955])
4XENATIVESTACK               abortHandler+0xe1 (0x00007F0E10AE7F81 [libj9dmp29.so+0x9f81])
4XENATIVESTACK                (0x00007F0E113BE320 [libc.so.6+0x42320])
4XENATIVESTACK                (0x00007F0E1140B0C6 [libc.so.6+0x8f0c6])
4XENATIVESTACK               raise+0x16 (0x00007F0E113BE276 [libc.so.6+0x42276])
4XENATIVESTACK               abort+0xd7 (0x00007F0E113A87B7 [libc.so.6+0x2c7b7])
4XENATIVESTACK               pool_newElement.cold+0x0 (0x00007F0E10CC8F15 [libj9prt29.so+0xcf15])
4XENATIVESTACK                (0x00007F0E113BE320 [libc.so.6+0x42320])
4XENATIVESTACK                (0x000000000004C070 [<unknown>+0x0])

gacholio avatar Aug 25 '24 19:08 gacholio

The crash occurs in a native method. VM flags: 0000000000040000

dmitripivkine avatar Aug 26 '24 13:08 dmitripivkine

@jasonkatonica FYI

dmitripivkine avatar Aug 26 '24 13:08 dmitripivkine

@yaakov-berkovitch can the crash be reproduced with -Djdk.nativeCrypto=false?
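
For reference, a minimal sketch of such a run. The jar name app.jar is a placeholder assumption, not something from this report; the only part taken from the thread is the -Djdk.nativeCrypto=false system property.

```shell
# app.jar is a hypothetical placeholder; substitute the real application jar.
APP_JAR="${APP_JAR:-app.jar}"

# Build the launch command with the native crypto path disabled for this run.
set -- java -Djdk.nativeCrypto=false -jar "${APP_JAR}"
echo "command: $*"

# Only launch when both a java binary and the jar are actually present.
if command -v java >/dev/null 2>&1 && [ -f "${APP_JAR}" ]; then
  "$@"
fi
```

Setting APP_JAR beforehand points the sketch at a different jar.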

JasonFengJ9 avatar Aug 26 '24 15:08 JasonFengJ9

@KostasTsiounis Can you take a look, Jason K is away.

tajila avatar Aug 26 '24 17:08 tajila

This is a bit odd. The failure seems to occur in jdk/crypto/jniprovider/NativeCrypto.loadCrypto, which is responsible for loading OpenSSL. That method, however, is not called at jdk/crypto/jniprovider/NativeCrypto.loadCryptoLibraries(NativeCrypto.java:87), as the stack trace reports, but at line 89. Line 87 is where our own native code is loaded.

So, I have a few questions:

  • Are the runs in both Alpine versions using the same JDK image?
  • Is it actually a JDK or a JRE? Also, is it an official Semeru download?
  • What OpenSSL version is present in either case?

Also, could you please run with -Djdk.nativeCryptoTrace=true on both Alpine versions and provide the output?
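
The details asked about above could be gathered on each image with something like the following sketch. The function name collect_env_info is made up for illustration; cat /etc/os-release, openssl version and java -version are the standard commands, and each probe is skipped when the tool is not installed.

```shell
# Hypothetical helper: print OS, OpenSSL and Java details for a bug report.
collect_env_info() {
  echo "== os-release =="
  if [ -r /etc/os-release ]; then cat /etc/os-release; else echo "not available"; fi

  echo "== openssl =="
  if command -v openssl >/dev/null 2>&1; then openssl version; else echo "not available"; fi

  echo "== java =="
  # java -version writes to stderr, so redirect it into the report.
  if command -v java >/dev/null 2>&1; then java -version 2>&1; else echo "not available"; fi
}

collect_env_info
```

Running it on both Alpine versions and comparing the two outputs shows at a glance whether the OpenSSL or JDK builds differ.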

KostasTsiounis avatar Aug 26 '24 18:08 KostasTsiounis

@KostasTsiounis thanks for helping me with this issue. Here is the information you asked for:

  • Yes, the two Alpine versions are running the same JDK image.
  • It's an official Semeru JDK: open-11.0.23_9-jdk-centos7. We build our image using multi-stage builds, and from this stage we take only part of the JDK distribution to reduce the size of the image.
  • openssl version: OpenSSL 3.3.1 4 Jun 2024 (Library: OpenSSL 3.3.1 4 Jun 2024)

I added the requested trace flag, but it does not generate much output apart from this: MessageDigest native crypto implementation enabled. Here is the more complete output:

MessageDigest native crypto implementation enabled.
Unhandled exception
Type=Segmentation error vmState=0x00040000
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
Handler1=00007F819179BAC0 Handler2=00007F81916F2730 InaccessibleAddress=000000000004C0D0
RDI=000000000000002B RSI=00007FFC8F0C2038 RAX=00007F81719AA550 RBX=00007F81719AA558
RCX=00007F8171A068A0 RDX=00007F818C020E30 R8=00007F818C812880 R9=FFFA320300000000
R10=000000005F8BFBFF R11=000000000000000D R12=00007FFC8F0C2038 R13=00007F818C020E30
R14=00007F81719AA558 R15=0000000000000000
RIP=000000000004C0D0 GS=0000 FS=0000 RSP=00007F8191D2D3E8
EFlags=0000000000010246 CS=0033 RBP=000000000000002B ERR=0000000000000015
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=000000000004C0D0
xmm0 0000000000050657 (f: 329303.000000, d: 1.626973e-318)
xmm1 56414a5f394a4e45 (f: 961171008.000000, d: 3.172462e+107)
xmm2 6c00726964646165 (f: 1684300160.000000, d: 1.730261e+212)
xmm3 0000726964646165 (f: 1684300160.000000, d: 6.215197e-310)
xmm4 78756e696c2d646c (f: 1814914176.000000, d: 1.811526e+272)
xmm5 00007f8191994050 (f: 2442739712.000000, d: 6.926527e-310)
xmm6 00007f818c7f4754 (f: 2357151488.000000, d: 6.926523e-310)
xmm7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm8 b7590200bb0b0000 (f: 3138060288.000000, d: -4.485558e-42)
xmm9 000000ffffffffff (f: 4294967296.000000, d: 5.432309e-312)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000015200000151 (f: 337.000000, d: 7.172346e-312)
xmm12 0000013d00000140 (f: 320.000000, d: 6.726727e-312)
xmm13 000001380000013f (f: 319.000000, d: 6.620627e-312)
xmm14 0000000008001800 (f: 134223872.000000, d: 6.631540e-316)
xmm15 000001420000013b (f: 315.000000, d: 6.832826e-312)
Target=2_90_20240522_1104 (Linux 5.10.219-208.866.amzn2.x86_64)
CPU=amd64 (16 logical CPUs) (0xf8097f000 RAM)

Also, as requested by @JasonFengJ9, I added -Djdk.nativeCrypto=false. No crash occurred and the application started to run.

yaakov-berkovitch avatar Aug 27 '24 09:08 yaakov-berkovitch

It makes sense that -Djdk.nativeCrypto=false would work since it disables this whole path that causes the issue.

We have had similar issues with native libraries in the JRE in the past, which were caused by adding --strip-debug as part of the linking stage. Are you by any chance using something similar to reduce the size of the utilized JDK? If the answer is yes, you can try copying the original lib/libjncrypto.so and check whether that fixes the problem.

KostasTsiounis avatar Aug 27 '24 13:08 KostasTsiounis

We are not using --strip-debug, but we are creating a custom JRE. Here is the first stage from the Dockerfile:

ARG ALPINE_DOCKER_IMAGE_TAG
ARG JAVA_DOCKER_IMAGE_TAG

FROM ibm-semeru-runtimes:${JAVA_DOCKER_IMAGE_TAG} as packager

# First stage: JDK with modules required for Spring Boot
RUN /opt/java/openjdk/bin/jlink \
    --module-path /opt/java/openjdk/jmods \
    --verbose \
    --add-modules java.base,java.logging,java.compiler,java.xml,java.sql,jdk.httpserver,jdk.unsupported,java.naming,java.desktop,java.management,jdk.crypto.cryptoki,jdk.crypto.ec,java.security.jgss,java.instrument,jdk.management.agent,jdk.localedata,openj9.sharedclasses,jdk.jartool \
    --compress 2 \
    --no-header-files \
    --output /jdk-minimal
# Also need the jar utility to uncompress the application jar files
# and the java library for debugging
RUN cp /opt/java/openjdk/bin/jar /jdk-minimal/bin/jar \
        && cp /opt/java/openjdk/lib/libjdwp.so /jdk-minimal/lib/libjdwp.so \
        && cp /opt/java/openjdk/lib/libdt_socket.so /jdk-minimal/lib/libdt_socket.so

# Second stage, add only our custom jdk distro and our app
FROM alpine:${ALPINE_DOCKER_IMAGE_TAG}

During build we pass as follows:

  • JAVA_DOCKER_IMAGE_TAG=open-17.0.12_7-jdk-jammy
  • ALPINE_DOCKER_IMAGE_TAG=3.20.2

yaakov-berkovitch avatar Aug 27 '24 13:08 yaakov-berkovitch

It could be the --compress 2. Could you try copying the lib/libjncrypto.so back after the jlink?
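
A minimal sketch of that copy step, using the paths from the Dockerfile stage above. The helper name copy_lib_back is made up for illustration; in the Dockerfile itself this would simply be one more cp in a RUN instruction of the packager stage.

```shell
# Hypothetical helper: copy the original library over the jlink output
# and confirm that the two files are byte-for-byte identical.
copy_lib_back() {
  src="$1"
  dst="$2"
  if [ ! -f "$src" ]; then
    echo "missing source library: $src" >&2
    return 1
  fi
  cp "$src" "$dst" && cmp -s "$src" "$dst"
}

# Paths as used in the Dockerfile above; guarded so this is a no-op elsewhere.
if [ -f /opt/java/openjdk/lib/libjncrypto.so ] && [ -d /jdk-minimal/lib ]; then
  copy_lib_back /opt/java/openjdk/lib/libjncrypto.so /jdk-minimal/lib/libjncrypto.so
fi
```

The cmp check is only there to confirm that the file in the custom runtime matches the one shipped in the original JDK.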

KostasTsiounis avatar Aug 27 '24 14:08 KostasTsiounis

I copied lib/libjncrypto.so back after the jlink step as you suggested, but it still crashed. I didn't remove --compress 2. Should I give that a try?

yaakov-berkovitch avatar Aug 27 '24 15:08 yaakov-berkovitch

Hey, I just noticed this issue... I thought OpenJ9 wasn't officially supported on Alpine Linux. Great news if it is!

avermeer avatar Aug 30 '24 17:08 avermeer

It's not.

pshipton avatar Aug 30 '24 18:08 pshipton

> I copied lib/libjncrypto.so back after the jlink step as you suggested, but it still crashed. I didn't remove --compress 2. Should I give that a try?

Yeah, that would give us some further insight.

KostasTsiounis avatar Sep 03 '24 15:09 KostasTsiounis

OK, I will give it a try and report back.

BTW, to make this issue not a blocker for us in DEV not for PROD, we are running with -Djdk.nativeCrypto=false. Can you confirm that it's a valid workaround for now and that there is no risk?

yaakov-berkovitch avatar Sep 04 '24 10:09 yaakov-berkovitch

> OK, I will give it a try and report back.
>
> BTW, to make this issue not a blocker for us in DEV not for PROD, we are running with -Djdk.nativeCrypto=false. Can you confirm that it's a valid workaround for now and that there is no risk?

Yes. This flag totally circumvents the problematic path.
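
One way to apply the workaround to every JVM started in the container, rather than editing each command line, is the JAVA_TOOL_OPTIONS environment variable. This is a sketch that assumes the runtime reads that variable at startup; in a Dockerfile the equivalent would be an ENV line.

```shell
# Append the workaround flag, preserving any options that are already set.
export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS:+${JAVA_TOOL_OPTIONS} }-Djdk.nativeCrypto=false"
echo "JAVA_TOOL_OPTIONS=${JAVA_TOOL_OPTIONS}"
```

Any java process launched from this shell afterwards inherits the variable.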

KostasTsiounis avatar Sep 04 '24 14:09 KostasTsiounis