openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

Prefer local compilations in more cases

Open cjjdespres opened this issue 2 years ago • 11 comments

We now prefer to compile locally when local sync compilation is enabled and the method in question is cold and synchronous.

A function was also passed the incorrect pc when adding a remote compilation request. This was also fixed.

Signed-off-by: Christian Despres [email protected]

cjjdespres avatar Jul 31 '22 18:07 cjjdespres

Attn @mpirvu. The fix you suggested appears to work.

cjjdespres avatar Jul 31 '22 18:07 cjjdespres

Great!

mpirvu avatar Aug 01 '22 14:08 mpirvu

Actually, my suggested fix was to provide the correct startPC to addMethodToBeCompiled() when we queue the remote retry. Not issuing the retry for cold sync compilations is more of a heuristic change.

mpirvu avatar Aug 01 '22 14:08 mpirvu

Oh, sure. I can make that change too. I only tried one of the failing tests (DaaLoadTest_all_special_5m_18) a few times as well, so I should probably grind the failing tests just to make sure it's totally gone.

cjjdespres avatar Aug 02 '22 00:08 cjjdespres

The failing tests on X all pass. I haven't tried power yet, but I doubt that will make a difference.

cjjdespres avatar Aug 04 '22 13:08 cjjdespres

When the tests passed, did you have both the "startPC" change and the change in heuristics in preferLocalCompilations()? The change in heuristics may mask the real bug, so it would be nice to validate that the "startPC" change in addMethodToBeCompiled() call is the actual fix. I agree that both changes should be delivered in this PR. Please add a couple of sentences in the commit description detailing what has changed.

mpirvu avatar Aug 04 '22 13:08 mpirvu

I haven't tested only the startPC part of it, only the heuristic change and both combined. I'll test the startPC part alone.

cjjdespres avatar Aug 04 '22 13:08 cjjdespres

It appears to segfault still, but in a slightly different way:

DLT stderr Type=Segmentation error vmState=0x00050000
DLT stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
DLT stderr Handler1=00007FF49897BC10 Handler2=00007FF498778E70 InaccessibleAddress=0000000000000008
DLT stderr RDI=0000000000000000 RSI=00007FF46C6A8B70 RAX=0000000000000000 RBX=00007FF46FEE8F30
DLT stderr RCX=0000000000000002 RDX=0000000000000220 R8=00007FF46CDDEE50 R9=0000000000000000
DLT stderr R10=00007FF3F40008F0 R11=00007FF46C6A8425 R12=00007FF46CDDEE50 R13=00000000000000AC
DLT stderr R14=00007FF494136130 R15=00007FF3F4007D10
DLT stderr RIP=00007FF4934A8DC2 GS=0000 FS=0000 RSP=00007FF46C6A86C0
DLT stderr EFlags=0000000000010206 CS=0033 RBP=00007FF46C6A8B70 ERR=0000000000000004
DLT stderr TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000008
DLT stderr xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
DLT stderr xmm1 98b8439000000000 (f: 0.000000, d: -1.361456e-189)
DLT stderr xmm2 00000000b7654321 (f: 3076866816.000000, d: 1.520174e-314)
DLT stderr xmm3 00007ff3f40004b0 (f: 4093641984.000000, d: 6.950799e-310)
DLT stderr xmm4 00000000005b8518 (f: 5997848.000000, d: 2.963331e-317)
DLT stderr xmm5 00007ff3f4000080 (f: 4093640704.000000, d: 6.950799e-310)
DLT stderr xmm6 00000000005b8b69 (f: 5999465.000000, d: 2.964130e-317)
DLT stderr xmm7 000000000053dab8 (f: 5495480.000000, d: 2.715128e-317)
DLT stderr xmm8 6154656572665f00 (f: 1919311616.000000, d: 7.168766e+160)
DLT stderr xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
DLT stderr xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
DLT stderr xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
DLT stderr xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
DLT stderr xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
DLT stderr xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
DLT stderr xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
DLT stderr Module=/root/openj9-openjdk-jdk17/build/linux-x86_64-server-release/images/jdk/lib/default/libj9jit29.so
DLT stderr Module_base_address=00007FF49333D000
DLT stderr 
DLT stderr Method_being_compiled=<unknown>
DLT stderr Target=2_90_20220731_000000 (Linux 5.4.0-122-generic)
DLT stderr CPU=amd64 (8 logical CPUs) (0x3e8847000 RAM)
DLT stderr ----------- Stack Backtrace -----------
DLT stderr _ZN2TR15CompilationInfo21addMethodToBeCompiledERNS_24IlGeneratorMethodDetailsEPv19CompilationPrioritybP19TR_OptimizationPlanPb13TR_YesNoMaybe+0x332 (0x00007FF4934A8DC2 [libj9jit29.so+0x16bdc2])
DLT stderr _ZN2TR15CompilationInfo23compileOnSeparateThreadEP10J9VMThreadRNS_24IlGeneratorMethodDetailsEPv13TR_YesNoMaybeP23TR_CompilationErrorCodePbP19TR_OptimizationPlan+0x12f9 (0x00007FF4934AAD29 [libj9jit29.so+0x16dd29])
DLT stderr _ZN2TR15CompilationInfo13compileMethodEP10J9VMThreadRNS_24IlGeneratorMethodDetailsEPv13TR_YesNoMaybeP23TR_CompilationErrorCodePbP19TR_OptimizationPlan+0x23d (0x00007FF4934AB18D [libj9jit29.so+0x16e18d])
DLT stderr j9jit_testarossa_err+0x158 (0x00007FF4934D55D8 [libj9jit29.so+0x1985d8])
DLT stderr _ZN37VM_DebugBytecodeInterpreterCompressed3runEP10J9VMThread+0x1fd49 (0x00007FF498A305F9 [libj9vm29.so+0xf15f9])
DLT stderr debugBytecodeLoopCompressed+0xb6 (0x00007FF498A108A6 [libj9vm29.so+0xd18a6])
DLT stderr  (0x00007FF498A74932 [libj9vm29.so+0x135932])

cjjdespres avatar Aug 04 '22 14:08 cjjdespres

That's bad. The heuristic change only masks the problem which appears when a recompilation is scheduled, so we need to fix the actual problem.

mpirvu avatar Aug 04 '22 14:08 mpirvu

Since the crash is exactly in addMethosToBeCompiled() it shouldn't be too hard to find the culprit instruction.

mpirvu avatar Aug 04 '22 14:08 mpirvu

I've updated the description, since #15601 is fixed by #15686.

cjjdespres avatar Aug 10 '22 20:08 cjjdespres

I am waiting for #15686 to finish testing and be merged. Then we are going to rebase this one to get the fix from #15686 and then I am going to start testing this one.

mpirvu avatar Aug 10 '22 21:08 mpirvu

Please rebase so that I can start testing.

mpirvu avatar Aug 10 '22 21:08 mpirvu

Rebase this one on #15686?

cjjdespres avatar Aug 10 '22 22:08 cjjdespres

Oh, I see that the other PR has been merged now.

cjjdespres avatar Aug 10 '22 22:08 cjjdespres

Rebased onto master.

cjjdespres avatar Aug 10 '22 22:08 cjjdespres

jenkins test sanity plinuxjit,xlinuxjit,zlinuxjit jdk17

mpirvu avatar Aug 10 '22 22:08 mpirvu

Tests have passed. Hence, merging.

mpirvu avatar Aug 11 '22 12:08 mpirvu