GHA: Promote macOS-arm64 cross-compilation job to full native job

Open kinke opened this issue 7 months ago • 7 comments

Using the new CI runners (awesome performance!).

kinke avatar Dec 05 '23 22:12 kinke

2 remaining failures:

  • std.internal.math.gammafunction unittests, with enabled optimizations only
  • lit-test driver/config_diag.d

These work for Cirrus CI, on macOS 12 (not 14, and surely a different Xcode version too).

kinke avatar Feb 08 '24 21:02 kinke

> 2 remaining failures:
>
>   • std.internal.math.gammafunction unittests, with enabled optimizations only
>   • lit-test driver/config_diag.d
>
> These work for Cirrus CI, on macOS 12 (not 14, and surely a different Xcode version too).

Hmm, some strange miscompile somehow?

lit-test driver/config_diag.d works for me on macOS 14.2.1, LLVM 17, Apple clang 15.0.0. The failing Phobos unittest also passes locally for me:

❯ bin/ldc2 -O -main -unittest -run ../ldc/runtime/phobos/std/internal/math/gammafunction.d
1 modules passed unittests

JohanEngelen avatar Feb 08 '24 23:02 JohanEngelen

Before merging this PR, I think I should download the artifacts (that's possible, right?), compare the output of the gamma unittest with my local build, and see if I can figure out what the miscompile is. Otherwise, I fear we will release a compiler that somehow miscompiles...
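A minimal sketch of such a comparison, assuming both compilers accept the same command line. The helper name `compare_ldc_outputs` and the paths in the usage line are hypothetical and not part of the LDC repository.

```shell
# Hypothetical helper: run the same invocation with two compiler binaries
# and diff their combined stdout/stderr. Returns 0 only if both the exit
# codes and the outputs match.
compare_ldc_outputs() {
    compiler_a=$1
    compiler_b=$2
    shift 2
    out_a=$(mktemp) || return 2
    out_b=$(mktemp) || return 2
    status_a=0
    status_b=0
    "$compiler_a" "$@" >"$out_a" 2>&1 || status_a=$?
    "$compiler_b" "$@" >"$out_b" 2>&1 || status_b=$?
    echo "exit codes: $status_a vs $status_b"
    same=0
    diff "$out_a" "$out_b" || same=1
    rm -f "$out_a" "$out_b"
    [ "$status_a" -eq "$status_b" ] && [ "$same" -eq 0 ]
}
```

Example usage, with placeholder paths for the downloaded artifact and the local build: `compare_ldc_outputs ./artifact/bin/ldc2 ./local/bin/ldc2 -O -main -unittest -run gammafunction.d`.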

JohanEngelen avatar Feb 10 '24 22:02 JohanEngelen

I downloaded the osx-universal artifact:

  • bin/ldc2 -O -main -unittest -run import/std/internal/math/gammafunction.d passes fine. Should I be running something else?
  • bin/ldc2 -conf=/Users/johan/ldc/ldc/tests/driver/inputs/noswitches.conf reproduces the failure (it works with other LDC builds, but crashes with the artifact LDC).

JohanEngelen avatar Feb 10 '24 23:02 JohanEngelen

About the bin/ldc2 -conf=/Users/johan/ldc/ldc/tests/driver/inputs/noswitches.conf failure, I may have found some hints:

  • it fails while throwing an exception. We intend to throw (and catch) the exception; that is exactly what the test is testing (throw new Exception("Could not look up switches in " ~ cast(string) dCfPath);).
  • after some searching, I think it is the only case where we throw an exception in the compiler. That is, I think it is the only CI test in which an exception is thrown inside the compiler itself.
  • when loading the ldc2 binary into lldb and running the test with -conf=, this is the output:
(lldb) run -conf=/Users/johan/ldc/ldc/tests/driver/inputs/noswitches.conf
Process 5078 launched: '/Users/johan/ldc/test_gha/ldc2-ce3f8516-osx-universal/bin/ldc2' (arm64)
Process 5078 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000195ebddb4 
         libunwind.dylib`libunwind::CFI_Parser<libunwind::LocalAddressSpace>::decodeFDE(libunwind::LocalAddressSpace&,
         unsigned long, libunwind::CFI_Parser<libunwind::LocalAddressSpace>::FDE_Info*, 
         libunwind::CFI_Parser<libunwind::LocalAddressSpace>::CIE_Info*, bool) + 48
libunwind.dylib`libunwind::CFI_Parser<libunwind::LocalAddressSpace>::decodeFDE:
  • when trying to check the backtrace (bt), lldb outputs a ton of these errors:
(lldb) bt
error: unable to find CIE at 0x33bbc for cie_id = 0xfffd1888 for entry at 0x5440.
error: unable to find CIE at 0x3887c for cie_id = 0xfffcf588 for entry at 0x7e00.
error: unable to find CIE at 0x10f38 for cie_id = 0xffff843c for entry at 0x9370.
error: unable to find CIE at 0x174b4 for cie_id = 0xffff2930 for entry at 0x9de0.
error: unable to find CIE at 0x3c990 for cie_id = 0xfffcd9e4 for entry at 0xa370.
  • Is the CIE data broken? From https://stackoverflow.com/questions/23914453/lldb-unable-to-find-cie: "CIE means Common Information Entry and is related to the Dwarf debug format."
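The interactive lldb session above could probably be scripted for repeated runs. The following is only a sketch: the function name `crash_backtrace` and the `LLDB` override variable are made up for illustration, and it assumes the installed lldb supports `--batch`, `-o` (`--one-line`) and `-k` (`--one-line-on-crash`), which is worth checking with `lldb --help`.

```shell
# Sketch: launch a program under lldb non-interactively and, if it
# crashes, print a backtrace and quit.
# Assumption: lldb accepts --batch, -o and -k as described above.
crash_backtrace() {
    # LLDB is a hypothetical override for selecting a debugger binary.
    "${LLDB:-lldb}" --batch -o run -k bt -k quit -- "$@"
}
```

Example usage: `crash_backtrace bin/ldc2 -conf=/Users/johan/ldc/ldc/tests/driver/inputs/noswitches.conf`.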

JohanEngelen avatar Feb 10 '24 23:02 JohanEngelen

The gammafunction module consistently fails on the new M1 GHA runners for the vanilla-LLVM jobs too, using vanilla LLVM 16 & 17. The config_diag.d lit-test works there though (different LLVM, no assertions, different host compiler, no LTO, no PGO...).

kinke avatar Feb 11 '24 11:02 kinke

@JohanEngelen: So wrt. gammafunction, I'd expect you to see it too, with the regular Phobos unittest runner.

Wrt. the thrown exception for the extra .conf, I'm wondering how the current CI artifacts behave (cross-compiled, but still with PGO + LTO + mimalloc IIRC). [And note that we don't compile the CI artifacts with -g, otherwise they'd be huge.]

kinke avatar Feb 24 '24 17:02 kinke

[Draft because of random "Pure virtual function called!" errors (at compiler runtime) in first experiments in #4604, very roughly 0-5 per CI run.] The previous issues are resolved by now.
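Since the error is intermittent, one way to estimate how often it occurs locally is to repeat a single invocation and count the runs that print the message. This is a rough sketch; the helper name `count_pure_virtual_failures` is hypothetical, and the command to repeat is left to the caller.

```shell
# Hypothetical helper: run a command a given number of times and print how
# many runs produced the "Pure virtual function called!" message on
# stdout or stderr.
count_pure_virtual_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture both streams; ignore the exit status of the command itself.
        output=$("$@" 2>&1) || true
        case $output in
            *'Pure virtual function called!'*) failures=$((failures + 1)) ;;
        esac
        i=$((i + 1))
    done
    echo "$failures"
}
```

Example usage: `count_pure_virtual_failures 100 bin/ldc2 -c some_module.d`, where `some_module.d` stands in for whatever input triggers the error.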

kinke avatar Apr 13 '24 23:04 kinke