openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

[WIP] Inline superClassTest for transformed checkCast nodes when castClass is runtime variable

Open rmnattas opened this issue 3 years ago • 1 comments

The case of castClass being a runtime variable was only for instanceOf nodes until #15079 (transforms Class.cast calls to checkCast nodes), for this new case we can inline superClassTest to improve performance. This part of common code is used by Z, P and AARCH64.

This PR allows checkCast nodes in the case of variable castClass to inline superClassTest. It also adds array and interface checks for castClass before the superClassTest, if so skip superClassTest, for P and AARCH64. It's already implemented in Z.

Description issue: #15535

WIP:

  • Finish/check AARCH64 code (using/not-using tbz/tbnz, immediate immr:imms)
  • P handle unknown classDepth for superClassTest
  • Testing and benchmark performance

rmnattas avatar Aug 08 '22 16:08 rmnattas

@r30shah @ymanton @0xdaryl

rmnattas avatar Aug 08 '22 16:08 rmnattas

@knn-k Can you have a look at the AArch64 code, espically using/not-using tbz/tbnz and advice on the immr:imms immediate format. I'm confused about the immediate as I found a way for it but it doesn't confirm with the existing 0x400 is immr:imms for 0x10000.

rmnattas avatar Aug 21 '22 21:08 rmnattas

@ymanton Please review the latest P code @knn-k Please review the final modifications

P testing HERE (the 3 non-passing extended tests are not caused by this change) Will run Z and AArch64 next

rmnattas avatar Aug 22 '22 02:08 rmnattas

I thought the check code could be simpler as:

if ((modifiers & (J9AccClassArray | J9AccInterface)) != 0)
	goto falseLabel;

But it does not work for AArch64, because you cannot embed 0x10200 into the tst (immediate) instruction.

knn-k avatar Aug 22 '22 02:08 knn-k

Updated the P code for a faster sequence:

Original Sequence:
 [0xa00000040210830]    11      andis.  GPR_0083, GPR_0083, 1
 [0xa00000040210930]    11      bne     CCR_0080, Outlined Label L0060
 [0xa000000402109d0]    11      andi.   GPR_0083, GPR_0082, 512
 [0xa00000040210ad0]    11      bne     CCR_0080, Outlined Label L0060

New Sequence:
 [    0x7fff65f0dc00]   11      lis     GPR_0068, 0000000000000001
 [    0x7fff65f0dca0]   11      ori     GPR_0068, GPR_0068, 512
 [    0x7fff65f0dd40]   11      and.    GPR_0068, GPR_0067, GPR_0068
 [    0x7fff65f0dde0]   11      bne     CCR_0065, Outlined Label L0044

rmnattas avatar Aug 22 '22 15:08 rmnattas

@knn-k Is a similar sequence to the last comment possible/beneficial in AArch64?

load 0x10200 into a register
and with compare flags set
branch

rmnattas avatar Aug 22 '22 15:08 rmnattas

For benchmark and performance

Running a microbenchmark for the execution time of the Class.cast(obj) call (measured on P): There's 28% improvement when the castClass is a superClass of the object (for the superClassTest inlined sequence) There's 3-4% performance hit when the castClass is not equal or a superClass of the object (for the inlined checks)

For the inlined superClassTest fix for instanceOf for P: There's 20% improvement when the castClass is a superClass of the object There's no performance hit as the current inlined code exist but ineffective

Will try to run a workload benchmark for overall effect.

rmnattas avatar Aug 22 '22 16:08 rmnattas

Is a similar sequence to the last comment possible/beneficial in AArch64?

loadConstant32(cg, node, (J9AccClassArray | J9AccInterface), scratchRegister2); // This requires two instructions on AArch64
generateTestInstruction(cg, node, scratchRegister1, scratchRegister2); // 32-bit comparison
generateConditionalBranchInstruction(cg, TR::InstOpCode::b_cond, node, falseLabel, TR::CC_NE);

Not tested. You need only one conditional branch with this sequence, while you need an additional temporary register.

knn-k avatar Aug 22 '22 22:08 knn-k

Updated AArch64 sequence

Original Sequence
[    0xffff80ddd4c0] 7210007f 11    tstimmw         w3, 0x10000
[    0xffff80ddd550] 54000ba1 11    b.ne    Outlined Label L0038    
[    0xffff80ddd5e0] 7217007f 11    tstimmw         w3, 0x200
[    0xffff80ddd670] 54000b61 11    b.ne    Outlined Label L0038

New sequence
[    0xffff8e8d3550] 52804002 11    movzw   w2, 0x0200
[    0xffff8e8d35e0] 72a00022 11    movkw   w2, 0x0001, LSL #16
[    0xffff8e8d3670] 6a02007f 11    tstw    w3, w2
[    0xffff8e8d3700] 54000b61 11    b.ne    Outlined Label L0038

rmnattas avatar Aug 23 '22 14:08 rmnattas

AArch64 performance

Running a microbenchmark for the execution time of the Class.cast(obj) call: There's 52% improvement when the castClass is a superClass of the object (for the superClassTest inlined sequence) There's 1.1% performance hit when the castClass is not equal or a superClass of the object (for the inlined checks)

Unlike P, the new sequence improvement is not significant. The one branch approach yielded the same performance (only 0.05% faster) than the original two branch sequence. In P the difference was almost 20% causing the two branch approach to almost cancel the benefit of this PR.

rmnattas avatar Aug 23 '22 14:08 rmnattas

This PR is ready to merge P tests: https://hyc-runtimes-jenkins.swg-devops.com/job/Pipeline-Build-Test-Personal/13950/ AArch64 test: https://hyc-runtimes-jenkins.swg-devops.com/job/Pipeline-Build-Test-Personal/13964/ (old-sequence)

rmnattas avatar Aug 23 '22 14:08 rmnattas

The AArch64 code looks good to me.

knn-k avatar Aug 24 '22 01:08 knn-k

Jenkins test sanity.functional all jdk8,jdk11

ymanton avatar Aug 24 '22 16:08 ymanton

Tests failed due to infrastructure issues, for example:

17:29:24  Cannot contact cent6-x64-1: java.lang.InterruptedException

I'll retry tomorrow.

ymanton avatar Aug 25 '22 02:08 ymanton

Jenkins test sanity.functional all jdk8,jdk11

knn-k avatar Aug 26 '22 00:08 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_s390x_linux_Personal_testList_0/206/ https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.functional_s390x_linux_Personal_testList_1/198/ https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_sanity.functional_s390x_linux_Personal_testList_0/176/ https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_sanity.functional_s390x_linux_Personal_testList_1/176/

Jobs on s390x_linux above failed with the following error in compiling org/apache/tools/ant/types/Resource.as(Ljava/lang/Class;)Ljava/lang/Object;. I don't know if they are caused by this PR or not.

[2022-08-26T06:01:48.798Z] Unhandled exception
[2022-08-26T06:01:48.798Z] Type=Segmentation error vmState=0x0005ff06
[2022-08-26T06:01:48.798Z] J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=f0b33e48 Signal_Code=00000001
[2022-08-26T06:01:48.798Z] Handler1=000003FF9DA47B28 Handler2=000003FF9D932F78 InaccessibleAddress=0000000000000000
[2022-08-26T06:01:48.798Z] gpr0=0000000000000000 gpr1=000003FF9CF97470 gpr2=0000000000000000 gpr3=000003FF9D2F23F0
[2022-08-26T06:01:48.798Z] gpr4=000003FF9CFB58E4 gpr5=0000000000000000 gpr6=0000000000000000 gpr7=0000000000000000
[2022-08-26T06:01:48.798Z] gpr8=0000000000000000 gpr9=000003FF9D339628 gpr10=000003FF9D336B00 gpr11=000003FF0D0D8890
[2022-08-26T06:01:48.798Z] gpr12=000003FF9D335000 gpr13=0000000000000000 gpr14=000003FF9CFB58E4 gpr15=000003FF756760E8
[2022-08-26T06:01:48.798Z] psw=000003FF9CFB595A mask=0705000180000000 fpc=0088fe00 bea=000003FF9CFB58FE
[2022-08-26T06:01:48.798Z] fpr0 409e000000000000 (f: 0.000000, d: 1.920000e+03)
[2022-08-26T06:01:48.798Z] fpr1 3fb5f15f15f15f16 (f: 368140064.000000, d: 8.571429e-02)
[2022-08-26T06:01:48.798Z] fpr2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-26T06:01:48.798Z] fpr3 3fd999cbe9982871 (f: 3919063040.000000, d: 4.000120e-01)
[2022-08-26T06:01:48.798Z] fpr4 409e000000000000 (f: 0.000000, d: 1.920000e+03)
[2022-08-26T06:01:48.798Z] fpr5 3fcc720a74b7854b (f: 1958184320.000000, d: 2.222303e-01)
[2022-08-26T06:01:48.798Z] fpr6 3fc75ecaddb24359 (f: 3719447296.000000, d: 1.825803e-01)
[2022-08-26T06:01:48.798Z] fpr7 3f7e17d2dc43b59b (f: 3695425024.000000, d: 7.346939e-03)
[2022-08-26T06:01:48.798Z] fpr8 00000000102e5be8 (f: 271473632.000000, d: 1.341258e-315)
[2022-08-26T06:01:48.798Z] fpr9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-26T06:01:48.798Z] fpr10 00000000102e5250 (f: 271471168.000000, d: 1.341246e-315)
[2022-08-26T06:01:48.798Z] fpr11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-26T06:01:48.798Z] fpr12 00000000102e5bf0 (f: 271473664.000000, d: 1.341258e-315)
[2022-08-26T06:01:48.798Z] fpr13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-26T06:01:48.798Z] fpr14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-26T06:01:48.798Z] fpr15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-26T06:01:48.798Z] Module=/home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_s390x_linux_Personal_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.so
[2022-08-26T06:01:48.798Z] Module_base_address=000003FF9C500000
[2022-08-26T06:01:48.798Z] 
[2022-08-26T06:01:48.798Z] Method_being_compiled=org/apache/tools/ant/types/Resource.as(Ljava/lang/Class;)Ljava/lang/Object;
[2022-08-26T06:01:48.798Z] Target=2_90_20220825_494 (Linux 3.10.0-1160.71.1.el7.s390x)
[2022-08-26T06:01:48.798Z] CPU=s390x (4 logical CPUs) (0x1ec1b4000 RAM)
[2022-08-26T06:01:48.798Z] ----------- Stack Backtrace -----------
[2022-08-26T06:01:48.798Z] _ZN2TR21S390BranchInstruction30assignRegistersAndDependenciesE16TR_RegisterKinds+0x1d2 (0x000003FF9CFB595A [libj9jit29.so+0xab595a])
[2022-08-26T06:01:48.798Z] _ZN3OMR1Z11Instruction15assignRegistersE16TR_RegisterKinds+0x2c (0x000003FF9D01B1D4 [libj9jit29.so+0xb1b1d4])
[2022-08-26T06:01:48.798Z] _ZN3OMR13CodeGenerator20doRegisterAssignmentE16TR_RegisterKinds+0xec (0x000003FF9CB16D7C [libj9jit29.so+0x616d7c])
[2022-08-26T06:01:48.798Z] _ZN3OMR12CodeGenPhase29performRegisterAssigningPhaseEPN2TR13CodeGeneratorEPNS1_12CodeGenPhaseE+0xa4 (0x000003FF9CB1D094 [libj9jit29.so+0x61d094])
[2022-08-26T06:01:48.798Z] _ZN3OMR12CodeGenPhase10performAllEv+0x10e (0x000003FF9CB1E1CE [libj9jit29.so+0x61e1ce])
[2022-08-26T06:01:48.798Z] _ZN3OMR13CodeGenerator12generateCodeEv+0x60 (0x000003FF9CB1A828 [libj9jit29.so+0x61a828])
[2022-08-26T06:01:48.798Z] _ZN3OMR11Compilation7compileEv+0xe28 (0x000003FF9CB4D888 [libj9jit29.so+0x64d888])
[2022-08-26T06:01:48.799Z] _ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadPNS_11CompilationEP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRKNS_16SegmentAllocatorE+0x4aa (0x000003FF9C686622 [libj9jit29.so+0x186622])
[2022-08-26T06:01:48.799Z] _ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x35a (0x000003FF9C6876BA [libj9jit29.so+0x1876ba])
[2022-08-26T06:01:48.799Z] omrsig_protect+0x366 (0x000003FF9D9340E6 [libj9prt29.so+0x340e6])
[2022-08-26T06:01:48.799Z] _ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x332 (0x000003FF9C684D0A [libj9jit29.so+0x184d0a])
[2022-08-26T06:01:48.799Z] _ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x1d2 (0x000003FF9C6853B2 [libj9jit29.so+0x1853b2])
[2022-08-26T06:01:48.799Z] _ZN2TR24CompilationInfoPerThread14processEntriesEv+0x3c6 (0x000003FF9C683DCE [libj9jit29.so+0x183dce])
[2022-08-26T06:01:48.799Z] _ZN2TR24CompilationInfoPerThread3runEv+0x66 (0x000003FF9C6842AE [libj9jit29.so+0x1842ae])
[2022-08-26T06:01:48.799Z] _Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x9a (0x000003FF9C684392 [libj9jit29.so+0x184392])
[2022-08-26T06:01:48.799Z] omrsig_protect+0x366 (0x000003FF9D9340E6 [libj9prt29.so+0x340e6])
[2022-08-26T06:01:48.799Z] _Z21compilationThreadProcPv+0x20a (0x000003FF9C68483A [libj9jit29.so+0x18483a])
[2022-08-26T06:01:48.799Z] thread_wrapper+0x114 (0x000003FF9D885A14 [libj9thr29.so+0x5a14])
[2022-08-26T06:01:48.799Z] start_thread+0xea (0x000003FF9E388312 [libpthread.so.0+0x8312])
[2022-08-26T06:01:48.799Z]  (0x000003FF9E18E232 [libc.so.6+0x10e232])
[2022-08-26T06:01:48.799Z] ---------------------------------------

knn-k avatar Aug 26 '22 06:08 knn-k

I have a strong suspicion that failures on Z are related to this change. Unfortunately I can not tell more as the failure happens while building test and none of them uploads any result files. Let me cherry pick this change and launch a build internally to see if it is reproducible

r30shah avatar Aug 26 '22 12:08 r30shah

Failure on Z was 100% reproducible. I looked through the log file and found that when we emit super class test on Z for checkCast node with dynamic cast class, we emitted a branch instruction with NULL call label (Instead of helper call label). I have fixed that in https://github.com/r30shah/openj9/commit/0623889dd8503703ae67f80b29764450faa6ebd5. @rmnattas Do you want to include this change as part of this PR? Or want me to open up separate PR which would need to be merged before this one.

r30shah avatar Aug 26 '22 21:08 r30shah

@r30shah Added the commit to this PR, and rebased it before my changes. @ymanton Can you run the tests again.

rmnattas avatar Aug 29 '22 13:08 rmnattas

Jenkins test sanity.functional all jdk8,jdk11

ymanton avatar Aug 29 '22 15:08 ymanton

Jenkins test sanity.functional all jdk8,jdk11

ymanton avatar Aug 30 '22 14:08 ymanton

Only a single failure (on AIX) in cmdLineTester_libpathTestRtf_0:

[2022-08-30T16:21:30.273Z]  [ERR] Exception in thread "main" java.awt.AWTError: Can't connect to X11 window server using 'unix:0' as the value of the DISPLAY variable.
[2022-08-30T16:21:30.273Z]  [ERR] 	at sun.awt.X11GraphicsEnvironment.initDisplay(Native Method)
[2022-08-30T16:21:30.273Z]  [ERR] 	at sun.awt.X11GraphicsEnvironment.access$200(X11GraphicsEnvironment.java:65)
[2022-08-30T16:21:30.273Z]  [ERR] 	at sun.awt.X11GraphicsEnvironment$1.run(X11GraphicsEnvironment.java:115)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.security.AccessController.doPrivileged(AccessController.java:691)
[2022-08-30T16:21:30.273Z]  [ERR] 	at sun.awt.X11GraphicsEnvironment.<clinit>(X11GraphicsEnvironment.java:74)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.lang.Class.forNameImpl(Native Method)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.lang.Class.forName(Class.java:340)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.awt.GraphicsEnvironment.createGE(GraphicsEnvironment.java:103)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.awt.GraphicsEnvironment.getLocalGraphicsEnvironment(GraphicsEnvironment.java:82)
[2022-08-30T16:21:30.273Z]  [ERR] 	at sun.awt.X11.XToolkit.<clinit>(XToolkit.java:131)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.lang.Class.forNameImpl(Native Method)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.lang.Class.forName(Class.java:340)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.awt.Toolkit$2.run(Toolkit.java:860)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.awt.Toolkit$2.run(Toolkit.java:855)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.security.AccessController.doPrivileged(AccessController.java:691)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.awt.Toolkit.getDefaultToolkit(Toolkit.java:854)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.awt.Toolkit.getEventQueue(Toolkit.java:1736)
[2022-08-30T16:21:30.273Z]  [ERR] 	at java.awt.EventQueue.isDispatchThread(EventQueue.java:1071)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.SwingUtilities.isEventDispatchThread(SwingUtilities.java:1366)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.text.StyleContext.reclaim(StyleContext.java:454)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.text.StyleContext.addAttribute(StyleContext.java:311)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.text.StyleContext$NamedStyle.addAttribute(StyleContext.java:1501)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.text.StyleContext$NamedStyle.setName(StyleContext.java:1312)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.text.StyleContext$NamedStyle.<init>(StyleContext.java:1259)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.text.StyleContext.addStyle(StyleContext.java:107)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.text.StyleContext.<init>(StyleContext.java:87)
[2022-08-30T16:21:30.273Z]  [ERR] 	at javax.swing.text.DefaultStyledDocument.<init>(DefaultStyledDocument.java:106)
[2022-08-30T16:21:30.273Z]  [ERR] 	at org.openj9.test.libpath.Rtf.convert(Rtf.java:42)
[2022-08-30T16:21:30.273Z]  [ERR] 	at org.openj9.test.libpath.Rtf.main(Rtf.java:60)

Looks like this is a known issue specific to AIX so I'm going to merge: https://github.com/eclipse-openj9/openj9/issues/14943

ymanton avatar Aug 31 '22 14:08 ymanton

@rmnattas can you open a PR for the v0.35.0-release branch as well?

ymanton avatar Aug 31 '22 14:08 ymanton