CI: macOS build failures

Open planetf1 opened this issue 1 year ago • 9 comments

Currently, macOS builds are failing.

The failures occur with gcc on macOS 13 and macOS 14.

[Screenshot 2024-05-31 at 11 09 48, image not shown]

For macOS 14:

| Component | Working | Failed |
| --- | --- | --- |
| runner | 2.316.1 | 2.316.1 |
| Xcode | 15.0.1 | 15.0.1 |
| gcc | Apple 13.2.0 | gcc 13.3.0 |
| example log | https://github.com/open-quantum-safe/liboqs/actions/runs/9211032819/job/25339585288 | https://github.com/open-quantum-safe/liboqs/actions/runs/9309161112/job/25624155404 |
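The comparison in the table can be made mechanical. The following is a minimal sketch that diffs the two sets of versions and reports which components changed. The dictionaries hold only the version numbers from the table, and the helper name `changed_components` is hypothetical, not part of the liboqs CI.

```python
# Version numbers copied from the macOS 14 table in this issue.
working = {"runner": "2.316.1", "Xcode": "15.0.1", "gcc": "13.2.0"}
failed = {"runner": "2.316.1", "Xcode": "15.0.1", "gcc": "13.3.0"}


def changed_components(before, after):
    """Return {component: (before, after)} for every version that differs."""
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(before.keys() | after.keys())
        if before.get(name) != after.get(name)
    }


print(changed_components(working, failed))
# prints {'gcc': ('13.2.0', '13.3.0')}
```

Only gcc differs between the working and failed runs, which is why it is the first suspect.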

For example

The current runner software versions are documented here.

| | Working | Failed |
| --- | --- | --- |
| Date | 2024-05-15 | 2024-05-29 |
| Image Version | 20240514.3 | 20240526.1 |
| OS Version | 13.6.6 (22G630) | 13.6.7 (22G720) |

planetf1 avatar May 31 '24 11:05 planetf1

I've created a DRAFT PR as a TEST which replaces gcc with 13.2.0 to validate if this is the cause.

  • macOS 13 gcc - SUCCESS

  • macOS 14 gcc - SUCCESS

Options from here:

  • wait for a fix to happen, somehow - but we don't know when

  • debug - try to build with gcc 13.3.0 locally (using (a) Homebrew or (b) a local build) and research open issues

  • implement a similar workaround in the short term (but we must monitor it and remove it later)
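If the short-term workaround is chosen, a guard on the compiler version would make it easier to remove later. The following is a minimal sketch, assuming that only gcc 13.3.0 is affected (as the findings in this thread suggest) and that the version string looks like the sample shown. The function names and the sample string are hypothetical and are not taken from the liboqs CI.

```python
import re

# Assumption: only gcc 13.3.0 is affected, per the findings in this thread.
KNOWN_BAD = {(13, 3, 0)}


def parse_gcc_version(version_output):
    """Extract the first x.y.z version number from `gcc --version` style text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found")
    return tuple(int(part) for part in match.groups())


def needs_workaround(version_output):
    """Return True if the reported gcc version is one of the known-bad versions."""
    return parse_gcc_version(version_output) in KNOWN_BAD


# Illustrative sample string, not captured from a real runner.
sample = "gcc-13 (Homebrew GCC 13.3.0) 13.3.0"
print(needs_workaround(sample))
```

Once the runners ship a fixed or newer gcc, the guard returns False and the workaround can be deleted.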

Note that the test PR had a failure from CircleCI on doxygen.

Thoughts @bhess @SWilson4 @baentsch ?

planetf1 avatar May 31 '24 11:05 planetf1

Additional commit reverts the version reversion ...

This seems to prove the cause.

Also note that some annotations are added. These are just due to stderr output when installing directly using ruby. They are cosmetic and can be cleaned up if we want to go with a workaround.

[macos (macos-14, -DOQS_USE_OPENSSL=OFF)](https://github.com/open-quantum-safe/liboqs/actions/runs/9317181419/job/25647067308#step:4:27)
Failed to load cask: gcc@13.rb
Cask 'gcc@13' is unreadable: wrong constant name #<Class:0x000000014973f790>

planetf1 avatar May 31 '24 11:05 planetf1

Thanks for looking into this, @planetf1. The doxygen error should be resolved by syncing your fork so that it includes commit a23046ffcea9b16b6bf9e2a4dc9c045316134dc2.

I am OK with merging a workaround, as this does seem like a non-liboqs problem. However, I feel that we should leave this issue (or another one) open as a reminder to remove the workaround whenever possible.

SWilson4 avatar May 31 '24 13:05 SWilson4

Thanks @SWilson4, I thought I was up to date. I've pushed an update now, so hopefully it will be cleaner (except for the annotations).

planetf1 avatar May 31 '24 13:05 planetf1

My initial reason for doing this was to at least understand why the failure occurs. Perhaps we could discuss it in the dev call next week and decide if we want the workaround. If so, we could use it as-is (perhaps squashed) or also clean up the warnings (redirect!).

planetf1 avatar May 31 '24 14:05 planetf1

Thanks @planetf1 for investigating and adding the workaround!

bhess avatar Jun 03 '24 07:06 bhess

@bhess My current plan is to share the findings at the OQS status call so we can discuss/agree. However, if there is a critical mass of agreement, we could merge ahead of time. Another option to consider is moving to a later gcc version.

planetf1 avatar Jun 03 '24 08:06 planetf1

> My current plan is to share the findings at the OQS status call so we can discuss/agree.

Sharing findings only at status calls is suboptimal for several reasons:

  1. It slows down development
  2. It robs non-participants of the option to check your findings
  3. It eliminates the option for users to search github for "previously known issues" and learn from them

--> I'd strongly suggest always creating PRs that others can see and comment on. If they do not get merged, at least others (and posterity) can see and learn; if they get merged, so much the better.

What I find troublesome is that these build failures did not become visible in the top-level CI reporting (GH CI report not shown). Created #1827 to track.

baentsch avatar Jun 27 '24 14:06 baentsch

@baentsch I very much agree with you on ensuring we always share info through the GitHub issues, for all of those reasons. The status call should only be in addition, and as a reminder to review the comment. Hopefully the text here did address that.

Good catch on addressing the status reporting.

planetf1 avatar Jul 01 '24 09:07 planetf1

It looks like the macos-13 and macos-14 runners now support GCC 14. Might it be worth jumping over 13.3.0 and going straight to 14 to see if that fixes the issue @planetf1?

SWilson4 avatar Nov 25 '24 19:11 SWilson4

Agreed. I will aim to look at this in the next day or so (it won't be before the status meeting).

planetf1 avatar Nov 26 '24 13:11 planetf1