gap icon indicating copy to clipboard operation
gap copied to clipboard

Clean up ARCH_IS_* functionality

Open ChrisJefferson opened this issue 1 year ago • 8 comments

This is just a placeholder issue, mainly so I don't forget when I come back from holiday.

Some hacks I applied when trying to clean up cygwin should be documented / polished before 4.12.

A brief history (from memory, going to go and explore to clear this up).

We have "ARCH_IS_UNIX" and "ARCH_IS_WINDOWS". In practice on Windows we only support cygwin, which is itself "UNIX-like".

We used to package a bare minimum of cygwin, which meant many standard "UNIX things" (like having 'sh' in particular), didn't work. This mean we special cased running external programs in Cygwin using cmd.exe.

In general, packages which had external executables didn't work in Cygwin, unless they had special code added to them which correctly set up windows-like paths to pass to cmd.

The new cygwin plan is to package a full minimal cygwin subsystem. Once we do this, GAP (mostly) requires no special code for windows, except for setting up the home directory (which is done before GAP starts), and dealing with windows line endings which users may still have in input files.

The problem which occurs is that all the packages with "ARCH_IS_WINDOWS" code actually get broken by doing this -- as they are trying to special case windows, when they would just work if they used their normal "UNIX" code path.

As an initial "hack" (which it turns out was broken in CI), I set up "ARCH_IS_WINDOWS" to mean "we do not have a standard UNIX install available".

This is technically described in the documentation for ARCH_IS_WINDOWS:

##  tests whether ⪆ is running on a Windows system without
##  standard POSIX tools available (such as a shell).

However, this is quite subtle -- in an ideal world we'd remove ARCH_IS_WINDOWS code from basically all packages -- the problem with that is then those packages wouldn't work in 4.11 or earlier.

ChrisJefferson avatar Jul 20 '22 08:07 ChrisJefferson

Bonus comment: I quickly tried the test that was hanging on my local copy of cygwin GAP, which (a) reports "ARCH_IS_UNIX", and (b) passes all tests. So this problem could be (gulp) something to do with github actions (or something else more subtle, but certainly not something I'm looking at now).

ChrisJefferson avatar Jul 20 '22 08:07 ChrisJefferson

Probably need to tmate into the GH actions job, and run the GAP tests with geb attached to figure out where it hangs... assuming gdb or something similar is available in Cygwin...?

fingolfin avatar Jul 20 '22 15:07 fingolfin

Unfortunately, I've tried this before and it doesn't work, as the shell you get from tmate is a cygwin one, but not ours, and you can't run programs from one cygwin install in another cygwin, due to DLL mess.

Last time I ended up just writing the action to run specific gap commands and observe the output :(

ChrisJefferson avatar Jul 20 '22 20:07 ChrisJefferson

Had a quick investigation of the issue. The problem is that, inside GitHub actions, running "cat" is failing. Obviously this shouldn't be happening! I will investigate further, I suspect this is due to two different "cat" ending up in the path somehow.

I will also try tweaking the test to at least not hang when 'cat' doesn't start correctly.

ChrisJefferson avatar Jul 27 '22 08:07 ChrisJefferson

That failing cat due to there being multiple cat binaries being around sounds quite plausible. Perhaps we should not rely on launching here on launching any external binaries at all, and instead try something like starting gap itself -- say gap --version or gap --help to make it exit quickly and still give us some output we can check. And instead of saying just gap it should use the full path to the GAP executable?

fingolfin avatar Aug 02 '22 08:08 fingolfin

On further investigation, it seems that Cygwin GAP on github actions can't run any external program with InputOutputLocalProcess, including another copy of gap. I now think the problem is probably that github actions has an incompatible other version of cygwin installed somewhere, and some DLL upset is happening.

I unfortunately I have no way to fix that, so I think I'll just have to disable this test. I'd prefer to just disable it on github actions (so it will hopefully occasionally get run outside of github actions and we'll notice if something breaks), but that would require a moderately horrible hack (like rming it if we are on windows).

ChrisJefferson avatar Aug 02 '22 20:08 ChrisJefferson

@ChrisJefferson do you still intend to work on this in the near future? Otherwise I'd drop it from the milestones.

fingolfin avatar Jan 12 '24 00:01 fingolfin

It doesn't need to be in the milestones

ChrisJefferson avatar Jan 31 '24 09:01 ChrisJefferson