JIT crash in TestFlushReflectionCache on JDK8 linux Z

Open tajila opened this issue 3 years ago • 2 comments

Failure link

/job_output.php?id=40181434

Optional info

  • intermittent failure (yes):
  • regression or new test: regression
  • if regression, what are the last passing / first failing public SHAs (OpenJ9, OMR, JCL) : First failure Aug 8

Failure output (captured from console output)

[Cmvc196982] [INFO] Cmvc196982 is setting up... 
[Cmvc196982] [INFO] Cmvc196982 is testing testAddingAnnotationsFlushesReflectCache()... 
Unhandled exception
Type=Illegal instruction vmState=0x00040000
J9Generic_Signal_Number=00000048 Signal_Number=00000004 Error_Value=00000000 Signal_Code=00000001
Handler1=000003FF8B3CD808 Handler2=000003FF8B12CA90
gpr0=0000000000000016 gpr1=000003FF8BD7B870 gpr2=000000000000000E gpr3=000003FF8B49F930
gpr4=000003FF4E25AE48 gpr5=000003FF845C80D0 gpr6=0000000000000000 gpr7=0000000000002502
gpr8=000003FF8ABF8028 gpr9=000003FF840934C0 gpr10=000003FF5E1AB548 gpr11=000003FF8A993028
gpr12=000003FF8AD940DC gpr13=000003FF8408FC00 gpr14=000003FF8BD7BCC0 gpr15=000003FF8BD7B8E8
psw=000003FF4E25AE50 mask=0705100180000000 fpc=0008fe00 bea=000003FF8B40F5C0
fpr0 000003ff8aba4f40 (f: 2327465728.000000, d: 2.171952e-311)
fpr1 3e6391608b4e1518 (f: 2337150208.000000, d: 3.644802e-08)
fpr2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
fpr3 3c12a4b000000000 (f: 0.000000, d: 2.526640e-19)
fpr4 000003ff844a3028 (f: 2219454464.000000, d: 2.171898e-311)
fpr5 3e924b0ea000c000 (f: 2684403712.000000, d: 2.725898e-07)
fpr6 000003ff8bd7cbc8 (f: 2346175488.000000, d: 2.171961e-311)
fpr7 3e3a364180008000 (f: 2147516416.000000, d: 6.102942e-09)
fpr8 000000009fed6f30 (f: 2683137792.000000, d: 1.325646e-314)
fpr9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
fpr10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
fpr11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
fpr12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
fpr13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
fpr14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
fpr15 0000000000000000 (f: 0.000000, d: 0.000000e+00)

Compiled_method=sun/instrument/InstrumentationImpl.redefineClasses([Ljava/lang/instrument/ClassDefinition;)V
Target=2_90_20220809_33927 (Linux 3.10.0-1160.66.1.el7.s390x)
CPU=s390x (4 logical CPUs) (0xfa8bc000 RAM)
----------- Stack Backtrace -----------
 (0x000003FF4E25AE50 [<unknown>+0x0])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2022/08/09 09:44:30 - please wait.
JVMDUMP032I JVM requested System dump using '/tmp/bld_33927/TestFlushReflectionCache_SE80_2/core.20220809.094430.35451.0001.dmp' in response to an event
JVMDUMP010I System dump written to /tmp/bld_33927/TestFlushReflectionCache_SE80_2/core.20220809.094430.35451.0001.dmp
JVMDUMP032I JVM requested Java dump using '/tmp/bld_33927/TestFlushReflectionCache_SE80_2/javacore.20220809.094430.35451.0002.txt' in response to an event


*** Invalid JIT return address 2F787A363438302F in 000003FF8BD75818
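One detail in the last log line may be worth noting: the value reported as the invalid JIT return address is made up entirely of printable ASCII bytes. The sketch below decodes it, assuming the hex digits are in memory order (s390x is big-endian). Whether this means string data overwrote a stack slot is only a guess; nothing in this thread confirms it.

```python
# Decode the value from the "Invalid JIT return address" line as raw bytes.
# Assumption: on big-endian s390x the printed hex digits match memory order.
bad_return_address = "2F787A363438302F"  # copied from the log line above

raw = bytes.fromhex(bad_return_address)
text = raw.decode("ascii", errors="replace")

# True only if every byte is a printable ASCII character.
printable = all(0x20 <= b < 0x7F for b in raw)

print(text, printable)  # prints: /xz6480/ True
```

The decoded text looks like a path fragment rather than a code address, which would fit the empty stack backtrace, but that reading is an interpretation and not a diagnosis.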

tajila avatar Aug 09 '22 19:08 tajila

@r30shah ^^

tajila avatar Aug 09 '22 19:08 tajila

The linked job did not upload any diagnostic files to inspect. I have launched Grinders on the internal farm to see if I can get a failure that does. 10x Grinder on the same machine: ?build_id=33937. 30x Grinder: ?build_id=33938.

Both passed. I am increasing the number of jobs in the Grinder (?build_id=33942) to see if I can get another failure that uploads result files.

r30shah avatar Aug 09 '22 20:08 r30shah

Rahil @r30shah, have you had any success reproducing this failure?

hzongaro avatar Aug 24 '22 20:08 hzongaro

@hzongaro No, a 100x Grinder run also passed.
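As a rough way to read a clean Grinder result, the sketch below computes an upper bound on the per-run failure rate after a number of consecutive passes. It assumes the runs are independent and identically distributed, which may not hold if the failure is tied to a particular machine. The function name and the 95% confidence default are illustrative choices, not part of any Grinder tooling.

```python
def failure_rate_upper_bound(clean_runs: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-run failure probability after only passing runs.

    Solves (1 - p) ** clean_runs = 1 - confidence for p.
    Assumption: runs are independent with a constant failure probability.
    """
    if clean_runs <= 0:
        raise ValueError("clean_runs must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_runs)


# 100 passing iterations, as in the 100x Grinder mentioned above.
bound_100 = failure_rate_upper_bound(100)
print(f"{bound_100:.4f}")
```

For 100 clean runs the bound is roughly 3%, so a failure rate of a few percent or lower cannot be ruled out by that run alone.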

r30shah avatar Aug 24 '22 20:08 r30shah

Peter @pshipton, as we haven't had success reproducing this yet, I'd suggest moving it out of the 0.35 release.

hzongaro avatar Aug 25 '22 00:08 hzongaro

Has this been reproduced since Aug 9th?

vij-singh avatar Oct 25 '22 13:10 vij-singh

I see it's "Illegal instruction", which we typically see on VICOM machines. I typically disable these machines in the builds so we don't get regular failures. We do have https://github.com/eclipse-openj9/openj9/issues/9876 tracking an active VICOM problem.
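For reference, the "Illegal instruction" type matches the Signal_Number and Signal_Code fields in the failure output. The sketch below parses that line and maps the values to their Linux names. It assumes the fields are printed in hexadecimal, and the lookup tables are small hand-written subsets of the Linux signal and `si_code` constants, not anything taken from OpenJ9.

```python
import re

# Line copied from the failure output in this issue.
signal_line = (
    "J9Generic_Signal_Number=00000048 Signal_Number=00000004 "
    "Error_Value=00000000 Signal_Code=00000001"
)

# Assumption: each value is a hexadecimal number.
fields = {
    name: int(value, 16)
    for name, value in re.findall(r"(\w+)=([0-9A-Fa-f]+)", signal_line)
}

# Subset of Linux signal numbers (generic numbering, also used on s390x).
LINUX_SIGNAL_NAMES = {4: "SIGILL", 7: "SIGBUS", 8: "SIGFPE", 11: "SIGSEGV"}

# si_code values defined for SIGILL on Linux.
SIGILL_CODE_NAMES = {
    1: "ILL_ILLOPC", 2: "ILL_ILLOPN", 3: "ILL_ILLADR", 4: "ILL_ILLTRP",
    5: "ILL_PRVOPC", 6: "ILL_PRVREG", 7: "ILL_COPROC", 8: "ILL_BADSTK",
}

signal_name = LINUX_SIGNAL_NAMES.get(fields["Signal_Number"], "unknown")
code_name = (
    SIGILL_CODE_NAMES.get(fields["Signal_Code"], "unknown")
    if signal_name == "SIGILL"
    else "n/a"
)

print(signal_name, code_name)  # prints: SIGILL ILL_ILLOPC
```

`ILL_ILLOPC` means the CPU rejected the opcode itself. That is consistent with execution reaching non-code bytes, but it does not by itself distinguish a JIT bug from a machine problem.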

pshipton avatar Oct 25 '22 14:10 pshipton

Closing as not reproducible.

tajila avatar Nov 03 '22 18:11 tajila