openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

SPECjbb2015_2 FAILED due to OutOfMemoryError or NullPointerException: Cannot load from object array because "this.objs" is null

Open JasonFengJ9 opened this issue 4 months ago • 14 comments

Failure link

From internal Test_openjdk25_j9_sanity.functional_x86-64_windows_testList_1 (win22x86-rtp-rt7-1)

openjdk version "25" 2025-09-16 LTS
IBM Semeru Runtime Open Edition 25.0-m1 (build 25+36-LTS)
Eclipse OpenJ9 VM 25.0-m1 (build v0.55.0-release-a3a9ca8b80, JRE 25 Windows Server 2022 amd64-64-Bit Compressed References 20250916_9 (JIT enabled, AOT enabled)
OpenJ9   - a3a9ca8b80
OMR      - 3aed163cd
JCL      - 1da1109ce78 based on jdk-25+36)

Rerun in Grinder - Change TARGET to run only the failed test targets

Optional info

Failure output (captured from console output)

[2025-08-25T21:28:02.798Z] Running test SPECjbb2015_2 ...
[2025-08-25T21:28:02.798Z] ===============================================
[2025-08-25T21:28:02.798Z] SPECjbb2015_2 Start Time: Mon Aug 25 14:28:01 2025 Epoch Time (ms): 1756157281931
[2025-08-25T21:28:02.800Z] variation: -Xgcpolicy:balanced -Xnocompressedrefs -Xmx2500M -Xmnx128M -XXgc:fvtest_forceCopyForwardHybridMarkCompactRatio=20 -XXgc:disableVirtualLargeObjectHeap
[2025-08-25T21:28:04.386Z] JVM_OPTIONS:  -Xgcpolicy:balanced -Xnocompressedrefs -Xmx2500M -Xmnx128M -XXgc:fvtest_forceCopyForwardHybridMarkCompactRatio=20 -XXgc:disableVirtualLargeObjectHeap 

[2025-08-26T00:10:46.059Z] JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2025/08/25 17:10:43 - please wait.

50x internal Grinder - in progress, one failure at win19x64-fips1-1

JasonFengJ9 avatar Aug 27 '25 20:08 JasonFengJ9

win19x64-fips1-1 seems to have a problem. The following message occurred 3000+ times in the console.

seems to be removed or offline (java.io.IOException: cannot find current thread); will wait for 5 min 0 sec for it to come back online

llxia avatar Sep 03 '25 13:09 llxia

Relaunched 50x internal Grinder => 1/50 failed

Failed Grinder on win19x86-svl-rt1-1- has 4000+ seems to be removed or offline (java.io.IOException: cannot find current thread); will wait for 5 min 0 sec for it to come back online message

llxia avatar Sep 03 '25 13:09 llxia

openjdk25_j9_sanity.functional_x86-64_windows(win22x86-rtp-rt7-1)

[2025-09-22T19:41:29.993Z] Running test SPECjbb2015_2 ...
[2025-09-22T19:41:29.993Z] ===============================================
[2025-09-22T19:41:30.882Z] SPECjbb2015_2 Start Time: Mon Sep 22 12:41:29 2025 Epoch Time (ms): 1758570089962
[2025-09-22T19:41:30.882Z] variation: -Xgcpolicy:balanced -Xnocompressedrefs -Xmx2500M -Xmnx128M -XXgc:fvtest_forceCopyForwardHybridMarkCompactRatio=20 -XXgc:disableVirtualLargeObjectHeap
[2025-09-22T19:41:30.882Z] JVM_OPTIONS:  -Xgcpolicy:balanced -Xnocompressedrefs -Xmx2500M -Xmnx128M -XXgc:fvtest_forceCopyForwardHybridMarkCompactRatio=20 -XXgc:disableVirtualLargeObjectHeap 

[2025-09-22T23:34:25.874Z] ???????????????.<Mon Sep 22 16:34:24 PDT 2025> org.spec.jbb.controller: TRANSITION: settling, (rIR:aIR:PR = 0:0:15) (tPR = 316) [OK] 
[2025-09-22T23:34:48.699Z] ????????????????????.JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2025/09/22 16:34:46 - please wait.
[2025-09-22T23:34:48.699Z] JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2025/09/22 16:34:46 - please wait.

[2025-09-22T23:37:59.437Z] java.lang.OutOfMemoryError: Java heap space
[2025-09-22T23:37:59.437Z] 	at org.spec.jbb.core.collections.CollectionUtils.elementsAsList(CollectionUtils.java:152)
[2025-09-22T23:37:59.437Z] 	at java.lang.OutOfMemoryErrororg.spec.jbb.hq.tx.SupermarketAudit: .Java heap space
[2025-09-22T23:37:59.437Z] analyze(SupermarketAudit.java:134)
[2025-09-22T23:37:59.437Z] 	at org.spec.jbb.hq.tx.SupermarketAudit.generate(SupermarketAudit.java:78)
[2025-09-22T23:37:59.437Z] 	at org.spec.jbb.hq.tx.BusinessReportTransaction.execute(BusinessReportTransaction.java:34)
[2025-09-22T23:37:59.437Z] 	at org.spec.jbb.core.tx.SimpleTransactionExecutor.execute(SimpleTransactionExecutor.java:37)
[2025-09-22T23:37:59.438Z] 	at org.spec.jbb.hq.HQ.execute(HQ.java:147)
[2025-09-22T23:37:59.438Z] 	at org.spec.jbb.core.threadpools.AbstractPool.processOne(AbstractPool.java:71)
[2025-09-22T23:37:59.438Z] 	at org.spec.jbb.core.threadpools.ForkJoinBatchTask.exec(ForkJoinBatchTask.java:87)
[2025-09-22T23:37:59.438Z] 	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:511)
[2025-09-22T23:37:59.438Z] 	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1450)

[2025-09-23T01:15:11.681Z] <Mon Sep 22 18:15:11 PDT 2025> org.spec.jbb: Failed to stop executor service java.util.concurrent.Executors$AutoShutdownDelegatedExecutorService@1a2f41e9, trying again
[2025-09-23T01:15:14.380Z] Exception in thread "Controller" java.lang.OutOfMemoryError: Java heap space
[2025-09-23T01:15:14.380Z] 	at java.base/java.lang.Class.getConstructorImpl(Native Method)
[2025-09-23T01:15:14.380Z] 	at java.base/java.lang.Class.getConstructor(Class.java:578)
[2025-09-23T01:15:14.380Z] 	at java.base/java.util.ResourceBundle$ResourceBundleProviderHelper.newResourceBundle(ResourceBundle.java:3515)

50x grinder - NPE occurred at https://github.com/eclipse-openj9/openj9/issues/22525#issuecomment-3330242168, OutOfMemoryError occurred at Grinder_iteration_3 & Grinder_iteration_4

JasonFengJ9 avatar Sep 24 '25 15:09 JasonFengJ9

@dmitripivkine pls take a look.

pshipton avatar Sep 24 '25 17:09 pshipton

NullPointerException: Cannot load from object array because "this.objs" is null from Grinder_iteration_4(win19x86-rtp-rt7-1)

12:42:37  org.spec.jbb.infra.ControllerException: java.lang.RuntimeException: java.lang.IllegalStateException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
12:42:37  	at org.spec.jbb.infra.AbstractController.parallelDo(AbstractController.java:667)
12:42:37  	at org.spec.jbb.infra.AbstractController.snapshotAll(AbstractController.java:1112)
12:42:37  	at org.spec.jbb.infra.AbstractController.doValidation(AbstractController.java:1174)
12:42:37  	at org.spec.jbb.infra.StagedController.runScenario(StagedController.java:174)
12:42:37  	at org.spec.jbb.infra.AbstractController.doRun(AbstractController.java:393)
12:42:37  	at org.spec.jbb.infra.AbstractController.run(AbstractController.java:185)
12:42:37  	at java.base/java.lang.Thread.run(Thread.java:1485)
12:42:37  Caused by: java.lang.RuntimeException: java.lang.IllegalStateException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
12:42:37  	at org.spec.jbb.core.mapreduce.MapReducer.leafAction(MapReducer.java:75)
12:42:37  	at org.spec.jbb.core.mapreduce.MapReducer.map(MapReducer.java:63)
12:42:37  	at org.spec.jbb.infra.AbstractController.parallelDo(AbstractController.java:665)
12:42:37  	... 6 more
12:42:37  Caused by: java.lang.IllegalStateException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
12:42:37  	at org.spec.jbb.infra.BackendAgent.snapshot(BackendAgent.java:308)
12:42:37  	at org.spec.jbb.infra.AbstractAgent.handleRequestGeneric(AbstractAgent.java:239)
12:42:37  	at org.spec.jbb.infra.AbstractAgent.handle(AbstractAgent.java:134)
12:42:37  	at org.spec.jbb.core.comm.Interconnect$Downlink.acceptRequest(Interconnect.java:1152)
12:42:37  	at org.spec.jbb.core.comm.Interconnect$Downlink.accept(Interconnect.java:1104)
12:42:37  	at org.spec.jbb.core.comm.Interconnect.forward(Interconnect.java:314)
12:42:37  	at org.spec.jbb.core.comm.Interconnect.access$700(Interconnect.java:96)
12:42:37  	at org.spec.jbb.core.comm.Interconnect$UplinkImpl.packetSend(Interconnect.java:951)
12:42:37  	at org.spec.jbb.core.comm.Interconnect$UplinkImpl.sendRequest(Interconnect.java:722)
12:42:37  	at org.spec.jbb.core.comm.Interconnect$UplinkImpl.sendRequest(Interconnect.java:816)
12:42:37  	at org.spec.jbb.infra.AbstractController$17.act(AbstractController.java:1115)
12:42:37  	at org.spec.jbb.infra.AbstractController$17.act(AbstractController.java:1112)
12:42:37  	at org.spec.jbb.core.mapreduce.MapReducer.leafAction(MapReducer.java:71)
12:42:37  	... 8 more
12:42:37  Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
12:42:37  	at java.base/java.util.concurrent.ForkJoinTask.getException(ForkJoinTask.java:568)
12:42:37  	at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:577)
12:42:37  	at java.base/java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1019)
12:42:37  	at org.spec.jbb.infra.BackendAgent.snapshot(BackendAgent.java:302)
12:42:37  	... 20 more
12:42:37  Caused by: java.lang.NullPointerException
12:42:37  	at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
12:42:37  	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
12:42:37  	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:483)
12:42:37  	at java.base/java.util.concurrent.ForkJoinTask.getException(ForkJoinTask.java:561)
12:42:37  	... 23 more
12:42:37  Caused by: java.lang.NullPointerException: Cannot load from object array because "this.objs" is null
12:42:37  	at java.base/java.io.ObjectOutputStream$HandleTable.lookup(ObjectOutputStream.java:2119)
12:42:37  	at java.base/java.io.ObjectOutputStream.writeClassDesc(ObjectOutputStream.java:1127)
12:42:37  	at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1309)
12:42:37  	at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1079)
12:42:37  	at java.base/java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:325)
12:42:37  	at java.base/java.util.concurrent.ConcurrentHashMap.writeObject(ConcurrentHashMap.java:1442)

Similar NPE occurred at Grinder_iteration_0(win11x86-svl-rt1-1), Grinder_iteration_1/(rtj-win16x86-svl-test-fr3dm-1), Grinder_iteration_2(rtj-win11x86-rtp-test-a4rop-1).

Edit: these three grinder jobs were marked as completed: SUCCESS

JasonFengJ9 avatar Sep 24 '25 18:09 JasonFengJ9

@dmitripivkine pls take a look.

Is there any build with results can be triaged? I have looked briefly and can not find any. We should check machine health somehow. I triaged OOM on Windows some time ago and reason was memory commit request failed during heap expansion, message can be found in Snap traces.

dmitripivkine avatar Sep 24 '25 19:09 dmitripivkine

@hzongaro can someone pls take a look at the NullPointerException. Mode 2 includes -XXgc:disableVirtualLargeObjectHeap, i.e. balanced arraylets

The failures occurred on 0.55 but I've assigned this to 0.56. Grinder on the latest 0.56 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/55069/

Failing the same way, NullPointerException, https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_2/5905 Also OutOfMemoryError, https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_4/4419

pshipton avatar Sep 24 '25 19:09 pshipton

As per https://github.com/eclipse-openj9/openj9/issues/22525#issuecomment-3330242168, NPE occurred at Grinder_iteration_0(win11x86-svl-rt1-1), Grinder_iteration_1/(rtj-win16x86-svl-test-fr3dm-1), Grinder_iteration_2(rtj-win11x86-rtp-test-a4rop-1), however these grinder jobs were marked as completed: SUCCESS.

FYI @llxia

JasonFengJ9 avatar Sep 25 '25 12:09 JasonFengJ9

NPEs in Specjbb2015 have seen before, see https://github.com/eclipse-openj9/openj9/issues/21986 (closed as duplicate of https://github.com/eclipse-openj9/openj9/issues/21822).

dmitripivkine avatar Sep 25 '25 13:09 dmitripivkine

I opened the issue for the false positive result problem.

llxia avatar Sep 25 '25 14:09 llxia

Looking to the core for OOM from https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_4/4419 This is OOM forced by Excessive GC detection. Mode: Balanced Arraylets mode (offheap disabled). Possibilities are:

  • false Excessive GC detection - probably not a case, heap looks really full
  • potential impact of unhealthy machine to reach Excessive GC criteria like "time spent" (?)
  • GC kept alive something unnecessary potentially
  • unexpected roots to prevent objects to die (like extended live of objects by JIT).
  • just unfortunate circumstances for normally functioning JVM when 2500m heap is too small.

dmitripivkine avatar Sep 26 '25 14:09 dmitripivkine

I have a grinder run in progress with the -Xdump option in hopes of capturing a core file for the NullPointerException.

I also have a second grinder run in progress for SPECjbb2015_0 with an older JDK 21 build — https://na.artifactory.swg-devops.com/ui/native/hyc-rt-java-delivery-generic-local/IBM/java/semeru-certified/21.0/21.0.6.0/ — to try to determine whether this is a regression. SPECjbb2015_0 runs with -Xgcpolicy:balanced without the -XXgc:disableVirtualLargeObjectHeap option, which wouldn't have been supported by the build I'm using, so I believe it should be equivalent to running SPECjbb2015_2 against a more recent build with the -XXgc:disableVirtualLargeObjectHeap option.

hzongaro avatar Sep 30 '25 19:09 hzongaro

I've seen both the OutOfMemoryError and the NullPointerException in my grinder run of SPECjbb2015_0 with the JDK 21 21.0.6.0 build. Unless I've messed up somehow, I believe neither failure represents a recent regression.

[2025-09-30T17:30:07.790Z] =JAVA VERSION OUTPUT BEGIN=
[2025-09-30T17:30:09.795Z] java version "21.0.6" 2025-01-21 LTS
[2025-09-30T17:30:09.795Z] IBM Semeru Runtime Certified Edition 21.0.6.0 (build 21.0.6+7-LTS)
[2025-09-30T17:30:09.795Z] Eclipse OpenJ9 VM 21.0.6.0 (build openj9-0.49.0, JRE 21 Windows Server 2019 amd64-64-Bit Compressed References 20250121_370 (JIT enabled, AOT enabled)
[2025-09-30T17:30:09.795Z] OpenJ9   - 3c3d179854
[2025-09-30T17:30:09.795Z] OMR      - e49875871
[2025-09-30T17:30:09.795Z] JCL      - c5f10f9bd79 based on jdk-21.0.6+7)
...
[2025-09-30T17:34:36.055Z] ===============================================
[2025-09-30T17:34:36.459Z] Running test SPECjbb2015_0 ...
[2025-09-30T17:34:36.885Z] ===============================================
[2025-09-30T17:34:37.291Z] SPECjbb2015_0 Start Time: Tue Sep 30 10:34:37 2025 Epoch Time (ms): 1759253677060
[2025-09-30T17:34:37.697Z] variation: -Xgcpolicy:balanced -Xnocompressedrefs -Xmx2500M -Xmnx128M -XXgc:fvtest_forceCopyForwardHybridMarkCompactRatio=20 -Xverbosegclog
[2025-09-30T17:34:38.099Z] JVM_OPTIONS:  -Xgcpolicy:balanced -Xnocompressedrefs -Xmx2500M -Xmnx128M -XXgc:fvtest_forceCopyForwardHybridMarkCompactRatio=20 -Xverbosegclog 

NullPointerException in https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_1/6339/, https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_2/5923, https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_3/4773/ and https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_4/4436

OutOfMemoryError in https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_0/6532/ and https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder_iteration_2/5923/

hzongaro avatar Oct 01 '25 02:10 hzongaro

Given that neither of the failure symptoms appear to be recent regressions with arraylets support, I would like to remove the blocker label and move this out of the 0.56 release.

hzongaro avatar Oct 01 '25 13:10 hzongaro