PackageTests/NewBuild/T3827 fails on MacOS on GHC 8.10.7

Open robx opened this issue 3 years ago • 26 comments

Affected:

  • macOS with GHC >= 8.10.7
  • Linux with GHC = 9.0.1, 9.2.1

PackageTests/NewBuild/T3827 fails in CI on MacOS on GHC 8.10.7:

[1 of 1] Compiling P                ( P.hs, /Users/runner/work/cabal/cabal/cabal-testsuite/PackageTests/NewBuild/T3827/cabal.dist/work/dist/build/x86_64-osx/ghc-8.10.7/p-1.0/build/P.p_o )

P.hs:1:8: error:
Error:     Could not find module ‘Prelude’
    Perhaps you haven't installed the profiling libraries for package ‘base-4.14.3.0’?
    Use -v (or `:set -v` in ghci) to see a list of the files searched for.
  |
1 | module P where
  |        ^
CallStack (from HasCallStack):
  withMetadata, called at src/Distribution/Simple/Utils.hs:374:14 in Cabal-3.7.0.0-inplace:Distribution.Simple.Utils
-----BEGIN CABAL OUTPUT-----
Error: cabal: Failed to build p-1.0-inplace.
Failed to build q-1.0 because it depends on q-1.0 which itself failed to build.
-----END CABAL OUTPUT-----

*** unexpected failure for PackageTests/NewBuild/T3827/cabal.test.hs

https://github.com/haskell/cabal/runs/5412517972?check_suite_focus=true

It looks like a profiling build of base isn't available, might be a CI environment issue?

robx avatar Mar 03 '22 21:03 robx

The test seems to pass for older GHC versions, while for newer ones those particular tests aren't run in CI.

robx avatar Mar 03 '22 21:03 robx

I can't find the ticket now, but I think indeed some newer GHCs in ghcup don't have profiling libs bundled. Tough luck.

Mikolaj avatar Mar 03 '22 21:03 Mikolaj

Hmm, there's this: https://gitlab.haskell.org/ghc/ghc/-/issues/20707 but it seems to be a new 9.2 issue (as opposed to 9.0 even)

robx avatar Mar 03 '22 21:03 robx

Yes, I think in a cabal ticket somebody said that also happened for some other GHCs. I might have misremembered though.

Mikolaj avatar Mar 03 '22 22:03 Mikolaj

After enabling the cli-suite in CI for ghc-9.0.2, I've observed that this test also fails on Linux with ghc-9.0.2 installed with ghcup (but not with ghc-9.0.1 and ghc-9.2.1): https://github.com/jneira/cabal/runs/5445996258?check_suite_focus=true#step:15:289

P.hs:1:8: error:
Error:     Could not find module `Prelude'
    Perhaps you haven't installed the profiling libraries for package `base-4.15.1.0'?
    Use -v (or `:set -v` in ghci) to see a list of the files searched for.
  |
1 | module P where
  |        ^
CallStack (from HasCallStack):
  withMetadata, called at src/Distribution/Simple/Utils.hs:374:14 in Cabal-3.7.0.0-inplace:Distribution.Simple.Utils

It seems that this GHC version has no profiled boot libraries.

jneira avatar Mar 07 '22 10:03 jneira

Thanks, I've updated the ticket to try to cover the affected versions.

robx avatar Mar 07 '22 11:03 robx

Sorry, the affected version is 9.0.2; I've corrected the comment above.

jneira avatar Mar 07 '22 11:03 jneira

I was playing with it and bumped into something strange. Here's my idea: this issue is not our bug; rather, it's a GHC (packaging) bug in certain versions of GHC. I thought we could specify exactly which versions of GHC are affected and close it. To that end I made the following change in the test:

-  missesProfilingLinux <- isGhcVersion ">= 9.0.2"
+  missesProfilingLinux <- isGhcVersion "== 9.0.2"
...
-  missesProfilingOsx <- isGhcVersion ">= 8.10.7"
+  missesProfilingOsx <- isGhcVersion "== 8.10.7"

(https://github.com/ulysses4ever/cabal/commit/316bc21301c1a11d01a91b9743261166c219e101) I expected this change to go green on CI since the switch from 9.2.1 to 9.2.3 (6ce51188), because, as I verified locally, GHC 9.2.3 does have profiling libraries (and according to the GHC bug tracker it was already fixed in 9.2.2). But the CI went red again (on 9.2.3), once more because of missing profiling libraries! Does anyone have an idea how that is possible?

ulysses4ever avatar Jun 29 '22 14:06 ulysses4ever

Impossible. Perhaps the version CI uses has lost the profiling libraries? Is it obtained from GHA or ghcup or where?

Mikolaj avatar Jun 29 '22 15:06 Mikolaj

This is the same CI that haskell/cabal has, so: GitHub Action haskell/action/setup@v1 which, in turn, uses ghcup.

ulysses4ever avatar Jun 30 '22 00:06 ulysses4ever

I can only find a report about 9.2.2, not 9.2.3

https://gitlab.haskell.org/ghc/ghc/-/issues/21190

but perhaps ghcup repackages those (and fixes 9.2.2 and, implausibly, breaks 9.2.3)? I haven't looked at the ghcup bug tracker (both open and closed tickets).

Mikolaj avatar Jun 30 '22 06:06 Mikolaj

@hasufell do you have an idea how it's possible that I get "profiling libraries not found" with 9.2.3? https://github.com/ulysses4ever/cabal/runs/7083135188?check_suite_focus=true

ulysses4ever avatar Jun 30 '22 11:06 ulysses4ever

Does anyone have an idea how that is possible?

The haskell setup action makes it hard to see which bindist exactly is installed. I've since then switched to just using ghcup directly, especially since it's pre-installed on all github actions images (and a recent version, unlike the haskell setup action).

It's possible that only some bindists are affected. E.g. hadrian has been incredibly buggy and some releases have a mixture of make and hadrian assembled bindists. Not sure about 9.2.3.

hasufell avatar Jun 30 '22 12:06 hasufell

This is now a heisenbug on GHC 9.2.3: https://github.com/haskell/cabal/issues/8336 (and known to cause problems on 9.4), so I'm going to disable the test altogether for GHC >= 9.2. This is most probably a GHA/GHC/packaging bug, not anything to do with cabal.

Mikolaj avatar Aug 03 '22 14:08 Mikolaj

@Mikolaj following your link, it's "Unexpected OK" now instead of "Unexpected FAIL". And that's no wonder because of the test itself:

  missesProfilingLinux <- isGhcVersion ">= 9.0.2"
...
  missesProfilingOsx <- isGhcVersion ">= 8.10.7"
  expectBrokenIf (linux && missesProfilingLinux || osx && missesProfilingOsx) 8032 $
...

It is not true that all GHCs above those versions are missing profiling libs, so we expectBroken when we shouldn't.

The change I discussed above

-  missesProfilingLinux <- isGhcVersion ">= 9.0.2"
+  missesProfilingLinux <- isGhcVersion "== 9.0.2"
...
-  missesProfilingOsx <- isGhcVersion ">= 8.10.7"
+  missesProfilingOsx <- isGhcVersion "== 8.10.7"

was never implemented, but it could have fixed this: we only need to list the GHCs with missing profiling libs.
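A minimal, self-contained sketch of that idea, modelled outside the test suite: instead of an open-ended `>=` range, keep an explicit table of (OS, GHC version) pairs known to lack profiling libraries. The names `OS`, `knownMissingProfiling` and `missesProfiling` are hypothetical and not part of cabal-testsuite, and the table contains only the two versions named in this thread.

```haskell
import Data.Version (Version, makeVersion, showVersion)

-- Hypothetical stand-in for the OS checks used in the real test.
data OS = Linux | OSX | Windows
  deriving (Eq, Show)

-- Assumption: only the releases named in this thread are listed here.
knownMissingProfiling :: [(OS, Version)]
knownMissingProfiling =
  [ (Linux, makeVersion [9, 0, 2])
  , (OSX,   makeVersion [8, 10, 7])
  ]

-- True only for an exact (OS, GHC version) match, never for a range.
missesProfiling :: OS -> Version -> Bool
missesProfiling os ver = (os, ver) `elem` knownMissingProfiling

main :: IO ()
main = mapM_ report
  [ (Linux, makeVersion [9, 0, 2])
  , (Linux, makeVersion [9, 2, 3])
  , (OSX,   makeVersion [8, 10, 7])
  ]
  where
    report (os, ver) =
      putStrLn $
        show os ++ " ghc-" ++ showVersion ver
          ++ ": expect broken = " ++ show (missesProfiling os ver)
```

In the real test, the result of such a lookup would feed the condition passed to expectBrokenIf. An exact list like this only helps if the failure is deterministic for a given GHC version.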

ulysses4ever avatar Aug 03 '22 15:08 ulysses4ever

Right, but it's a heisenbug. It sometimes passes and sometimes fails with the same GHC (9.2.3, but other 9.2.* versions are likely affected too, and 9.4 possibly as well). I don't think it's worthwhile to create an exhaustive list of GHCs currently broken by GHA (or whatever the underlying cause may be).

Mikolaj avatar Aug 03 '22 15:08 Mikolaj

If it's nondeterministic, then yes. It's just that the current code perfectly matches the error you referenced. If there are other failures, then I don't see a better solution. I'd still change those >= to something closer to reality, though.

ulysses4ever avatar Aug 03 '22 15:08 ulysses4ever

Again: Use ghcup directly in your github workflow, not the haskell setup action, because it reuses existing GHCs, which can be random bindists.

hasufell avatar Aug 03 '22 15:08 hasufell

Hmm, I would bet it uses some bindist deterministically, for better or worse (yes, sometimes for the better, like when it installed a fixed downstream bindist from chocolatey).

jneira avatar Aug 03 '22 19:08 jneira

@jneira https://github.com/haskell/actions/tree/main/setup says:

The GitHub runners come with pre-installed versions of GHC and Cabal. Those will be used whenever possible. For all other versions, this action utilizes ppa:hvr/ghc, ghcup, and chocolatey.

This doesn’t strike me as a very deterministic (so to speak) algorithm. E.g. the pre-installed versions may well change over time.


@hasufell do you have a good example of a purely ghcup-based setup in mind? I guess, some system-dependent boilerplate will be required?

ulysses4ever avatar Aug 04 '22 01:08 ulysses4ever

https://www.haskell.org/ghcup/guide/#continuous-integration

https://github.com/hasufell/stack2cabal/blob/master/.github/workflows/haskell.yml

https://github.com/haskell/unix/blob/a4c6a0c0a7477dfe12727c2a58f143e9f6bbf22e/.github/workflows/ci.yml#L64

hasufell avatar Aug 04 '22 01:08 hasufell

https://github.com/hasufell/stack2cabal/blob/master/.github/workflows/haskell.yml

Seems to use haskell/actions/setup.

https://github.com/haskell/unix/blob/a4c6a0c0a7477dfe12727c2a58f143e9f6bbf22e/.github/workflows/ci.yml#L64

That’s nice! Doesn’t have caching of ghcup and everything that it pulls, though.

ulysses4ever avatar Aug 04 '22 01:08 ulysses4ever

You don't want caching of bindists.

hasufell avatar Aug 04 '22 01:08 hasufell

Why though?

ulysses4ever avatar Aug 04 '22 02:08 ulysses4ever

Causes issues if cache is broken or bindists are fixed in-place (we don't do that usually though).

The failure mode is: if caching is enabled and the bindist exists in the cache, use that. If the hash doesn't match, fail and do nothing.

Then you get those heisenbugs.
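The failure mode described above can be modelled as a small pure function. This is only an illustrative sketch, not ghcup's or the setup action's actual code: `Outcome`, `CacheEntry` and `resolveBindist` are hypothetical names, and falling back to a fresh download when caching is off or nothing is cached is an assumption that the description above does not spell out.

```haskell
-- Hypothetical model of the cache decision; not a real ghcup API.
data Outcome
  = UseCached FilePath   -- cached bindist is reused as-is
  | Download             -- assumption: fetch fresh when nothing usable is cached
  | FailHashMismatch     -- cached file exists but its hash is wrong: fail, do nothing
  deriving (Eq, Show)

data CacheEntry = CacheEntry
  { entryPath :: FilePath
  , entryHash :: String
  }

resolveBindist :: Bool -> Maybe CacheEntry -> String -> Outcome
resolveBindist cachingEnabled cached expectedHash
  | not cachingEnabled = Download
  | otherwise = case cached of
      Nothing -> Download
      Just e
        | entryHash e == expectedHash -> UseCached (entryPath e)
        | otherwise                   -> FailHashMismatch

main :: IO ()
main = do
  let stale = CacheEntry "ghc-9.2.3.tar.xz" "old-hash"
  -- A bindist fixed in place upstream no longer matches the cached copy.
  print (resolveBindist True (Just stale) "new-hash")
  print (resolveBindist True (Just stale) "old-hash")
  print (resolveBindist False (Just stale) "new-hash")
```

In this model, the same requested GHC version can produce different outcomes depending only on what happens to be in the cache, which is the kind of run-to-run variation that shows up as a heisenbug.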

hasufell avatar Aug 04 '22 02:08 hasufell

I'd think that if cache invalidation is done correctly, no problem should arise. But then again: caching is the other hard problem in computer science?..

ulysses4ever avatar Aug 04 '22 03:08 ulysses4ever