Sparc: VM miscompiled with `-O2`

Open nbuwe opened this issue 1 year ago • 4 comments

The reldbg build defaults to -O2, and on NetBSD/sparc with gcc (nb2 20230710) 10.5.0 the VM gets a SIGBUS when trying to load the world:

#0  Conversion::convertVFrames (this=0x6ebfc8) at vm/src/any/runtime/conversion.cpp:176
#1  0x000c75f8 in Conversion::convert (this=0x6ebfc8) at vm/src/any/runtime/conversion.cpp:66
#2  0x000c8578 in Conversion::doit (this=0x6ebfc8) at vm/src/any/runtime/conversion.cpp:14
#3  0x000e6208 in switchToVMStack (continuation=0xc9380 <ConvertFrame_cont()>) at vm/src/any/runtime/process.cpp:1800
#4  0x000cbd14 in ConvertFrame (isInterp=false, nlrHomeID=14, nlrHome=0xe7ffe030, nlr=true, sp=0xe7ffe030, result=0x40e5785) at vm/src/any/runtime/frame.cpp:518
#5  HandleReturnTrap (result=0x40e5785, sp_of_patched_frame=0xe7ffe030, nlr=<optimized out>, nlrHome=0xe7ffe030, nlrHomeID=14) at vm/src/any/runtime/frame.cpp:588
#6  0x00175954 in ReturnTrapNLR_returnPC ()

Telling reldbg to use the more conservative -Og results in a VM that seems to work ok and passes the tests.

Unfortunately, I currently don't have time to debug this further or to binary-search for the -f optimization, enabled by -O2 but not by -Og, that triggers this.

nbuwe avatar Aug 30 '23 02:08 nbuwe

To narrow it down a bit, -O1 is ok, -Os fails.

nbuwe avatar Aug 30 '23 22:08 nbuwe

So the failure being related to frames was a broad hint and, indeed, -fno-optimize-sibling-calls helps. But the time to load the world and to do --runAutomaticTests is significantly worse for -Os -fno-optimize-sibling-calls than for -O1.
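For readers unfamiliar with the flag, here is a minimal sketch of what a sibling call looks like. The functions `finish` and `step` are hypothetical and not taken from the Self VM sources; the point is only that the call in `step` is in tail position, so with sibling-call optimization enabled gcc may emit a jump that reuses the caller's frame instead of a regular call and return. Code that inspects or patches frames, as the backtrace above suggests, could plausibly be sensitive to that, though this thread does not pin down the exact mechanism.

```cpp
// Illustrative only: hypothetical functions, not from the Self VM sources.

// noinline (a GCC/Clang attribute) keeps the calls from being inlined away,
// so the tail call in step() stays visible in the generated code.
__attribute__((noinline)) int finish(int x) {
    return x + 1;
}

// The call to finish() is the last thing step() does (a "sibling call").
// With -foptimize-sibling-calls the compiler is allowed to replace the
// call/return pair with a jump that reuses step()'s stack frame.
// With -fno-optimize-sibling-calls it keeps a regular call.
__attribute__((noinline)) int step(int x) {
    return finish(x * 2);
}
```

To see the difference, one can compile this with `g++ -O2 -S` and again with `g++ -O2 -fno-optimize-sibling-calls -S` and compare the assembly generated for `step`; the computed result is the same either way.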

nbuwe avatar Aug 31 '23 01:08 nbuwe

I wonder how much the higher -O levels above -O1 are actually buying us.

I'm having issues getting NetBSD Sparc to run on Qemu (I don't have Sparc hardware anymore) but I'll have a look at this as soon as I get it running.

russellallen avatar Aug 31 '23 05:08 russellallen

Beware that sparc needs a few tweaks, which I think I mentioned in the PR:

  • a local copy of .mul in vm/src/sparc/prims/asmPrims_sparc.S, b/c the v8 stub on NetBSD doesn't follow the .mul ABI (#152). Nothing in the gcc-generated code relies on that ABI, but Self does. I should probably do a PR that provides a local .mul for the very unlikely case that someone runs it on v7 and uses a v8 multiplication otherwise
  • a workaround for #149, for which I currently use a version of libc compiled with phk malloc instead of jemalloc (USE_JEMALLOC=no), plus an additional implementation of __je_sallocx, which is used by true_size_of_malloced_obj in vm/src/any/runtime/allocation.cpp (it pokes in malloc internals, so there is no easy way to provide it out of band)

nbuwe avatar Aug 31 '23 11:08 nbuwe