CI for creating Windows .exe for release tags is broken

Open fingolfin opened this issue 2 years ago • 2 comments

I tagged v4.12.0 and our CI workflow for creating .exe files failed (log) with this error:

tar: gap-4.12.0/pkg/ace/htm/CHAP007.htm: Cannot create symlink to 'CHAP00A.htm': No such file or directory
tar: gap-4.12.0/pkg/ace/htm/CHAP008.htm: Cannot create symlink to 'CHAP00B.htm': No such file or directory
tar: gap-4.12.0/pkg/ace/htm/CHAP009.htm: Cannot create symlink to 'CHAP00C.htm': No such file or directory
tar: Exiting with failure status due to previous errors
Error: Process completed with exit code 2.

Indeed, the latest ACE release has some symlinks here, to resolve issue #4430. When I made that change, I did not have in mind that Windows doesn't have true symlinks and might not like it.

We need to fix the immediate problem, but I also think we clearly have a long-term issue here: while we test the "release CI job" daily, that test was insufficient to catch this issue, because the failure occurred in one of the parts of the CI job that is only executed for releases. Bad!

We need to resolve both.

Short term: I will cook up a workaround so the 4.12.0 release can be completed. I may just be evil and modify that gap-4.12.0 tarball by replacing (?) those symlinks

Mid term: A proper fix for the above issue needs to be found. Perhaps ACE should just not create a symlink there, but instead just copy the HTML file? Then again... why does it only fail when it extracts https://github.com/gap-system/gap/releases/download/v4.12.0/gap-4.12.0.tar.gz but does not fail when it extracts https://github.com/gap-system/PackageDistro/releases/download/latest/packages.tar.gz ?? My only theory is that the order in which the files occur in the tarballs is different: in the "good" one the symlinks come after the real files, while in the "bad" one the symlinks come before, resulting in the observed error. However, this theory is backed by nothing other than my free interpretation of the error message, so it might be utter nonsense.
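
One way to check the ordering theory above without a Windows runner would be to scan both tarballs and list every symlink that appears before the file it points to. The following is only a minimal sketch using Python's standard `tarfile` module; the function name `symlinks_before_targets` is made up for illustration, and it assumes relative link targets like the ones in the error message.

```python
import posixpath
import tarfile


def symlinks_before_targets(tar: tarfile.TarFile) -> list[tuple[str, str]]:
    """Return (link, target) pairs whose target has not been seen yet.

    A pair is reported when the symlink entry comes earlier in the archive
    than its target, and also when the target is missing from the archive
    altogether (a dangling link).
    """
    seen = set()
    early = []
    for member in tar:  # members are yielded in archive order
        name = posixpath.normpath(member.name)
        seen.add(name)
        if member.issym():
            # Resolve the link target relative to the directory of the link.
            target = posixpath.normpath(
                posixpath.join(posixpath.dirname(name), member.linkname)
            )
            if target not in seen:
                early.append((name, target))
    return early
```

Running this on an archive opened with `tarfile.open(...)` for each of the two tarballs and comparing the results would either support or refute the theory: if it holds, the "bad" tarball should report the `CHAP007.htm` to `CHAP009.htm` links and the "good" one should report nothing.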

Long term: We need to enhance the CI test to find this earlier! To this end, I'd like to minimize the difference between when it runs on a release tag vs. the "usual" runs. One way would be to modify the offending step `Download the appropriate GAP release tarball` as follows: instead of downloading the release tarball, we download an artifact created by the previous job in the workflow. However, to conserve storage, we only upload those artifacts for scheduled runs; well, let's upload them also for release tags, and then modify the `Download the appropriate GAP release tarball` step to use these artifacts, but also run it on the scheduled runs... Of course I am also open to other ideas!

CC @wilfwilson @ChrisJefferson @FriedrichRober

Aug 18 '22 22:08 fingolfin

One mildly horrible way I have seen this fixed in the past is to run that twice, like "tar ... || tar ...". That obviously is a bit horrible, but might be a reasonable short term fix.
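
For illustration, the "run it twice" idea could also be written as a small retry wrapper instead of a shell `||`. This is just a sketch: `run_with_retry` is a hypothetical helper, not something in the GAP CI scripts, and the actual `tar` arguments are elided above, so the command is left to the caller.

```python
import subprocess
import sys


def run_with_retry(cmd: list[str], attempts: int = 2) -> int:
    """Run cmd up to `attempts` times; return the exit code of the last run."""
    code = 1
    for attempt in range(1, attempts + 1):
        code = subprocess.run(cmd).returncode
        if code == 0:
            break  # succeeded, no need to retry
        print(f"attempt {attempt} failed with exit code {code}", file=sys.stderr)
    return code
```

The reason a second pass might help is that the first pass has already extracted the real files, so the link targets exist by the time the symlinks are created again. Like the shell version, it hides the first failure rather than fixing its cause.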

There are some symlink configuration options we could try, but I would prefer not to do those at the moment, as it took a while to get the symlink in the installer to work correctly.

Aug 18 '22 23:08 ChrisJefferson

I have replaced gap-4.12.0.tar.gz and gap-4.12.0.tar.gz.sha256 with doctored versions (replacing the symlinks with file copies). Re-running the CI, it now gets past the error... Fingers crossed.
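
For anyone who needs to repeat this kind of doctoring, here is a rough sketch of how symlinks in a tarball could be replaced by file copies with Python's `tarfile` module. It is not the procedure actually used for the 4.12.0 tarball; the function name `replace_symlinks_with_copies` is hypothetical, and it assumes the source archive is opened in a seekable (non-streaming) mode.

```python
import posixpath
import tarfile


def replace_symlinks_with_copies(src: tarfile.TarFile, dst: tarfile.TarFile) -> int:
    """Copy src into dst, turning symlinks to regular files into file copies.

    Returns the number of symlinks replaced. Symlinks whose target is not a
    regular file in the archive (directories, dangling or chained links)
    are copied over unchanged.
    """
    members = src.getmembers()
    by_name = {posixpath.normpath(m.name): m for m in members}
    replaced = 0
    for member in members:
        if member.issym():
            # Resolve the link target relative to the directory of the link.
            target_name = posixpath.normpath(
                posixpath.join(posixpath.dirname(member.name), member.linkname)
            )
            target = by_name.get(target_name)
            if target is not None and target.isfile():
                # New regular-file entry under the link's name, target's data.
                copy = tarfile.TarInfo(member.name)
                copy.size = target.size
                copy.mode = target.mode
                copy.mtime = member.mtime
                copy.uid, copy.gid = member.uid, member.gid
                copy.uname, copy.gname = member.uname, member.gname
                dst.addfile(copy, src.extractfile(target))
                replaced += 1
                continue
        # Everything else is copied as-is (file data only for regular files).
        dst.addfile(member, src.extractfile(member) if member.isfile() else None)
    return replaced
```

The checksum file has to be regenerated afterwards, since the rewritten tarball no longer matches the original hash.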

So that settles the "short term" issue, now we can think about proper fixes...

Aug 18 '22 23:08 fingolfin

I've also modified ace and made a new release, replacing the symlinks with copies.

Sep 29 '22 12:09 fingolfin