Building with Nix - various problems with the autoconf build process

Open pdg137 opened this issue 1 year ago • 10 comments

Hello,

As a personal project, I've been trying to build Macaulay2 with the Nix build system. The advantage of a Nix build is that it's available on many platforms - for example it would automatically support almost every Linux distribution - and gives a well-defined, predictable result, independent of the state of the OS it's built on. It's also a kind of extremely dependable documentation for how to build a package, replacing the vague or incomplete instructions in INSTALL-type documents.

Look at the math packages already available in Nix to see how much they are missing M2!

I am not a Nix expert, don't know much about Macaulay2 either, and have not managed to get it all the way through the build process, but I think I've already uncovered some difficulties that might be worth addressing. Also, in case there are any NixOS users in the M2 community, maybe someone will be interested in using this as a starting point:

https://github.com/pdg137/macaulay2-build

Note that I am using the autoconf build process. Here's what I have run into:

  • Several dependencies are not checked by autoconf and will cause build failures much later than necessary: bison, perl, texinfo, time (the time command isn't even necessary; I just removed it from the build)
  • There are several places where the build process tries to run git commands, not recognizing that I have already checked out all the source code. This fails since a) I don't have network access at build time and b) I am not providing git as a build input.
  • One spot where a shell script autogen.sh is assumed to be executable; it's more reliable to run it with sh.
  • Failure to generate some version string when double quotes are returned by lsb_release.
  • Some mysterious problem running dlopen "libmpfr.so" (this is what I'm working on now - it fails to generate the documentation for openSharedLibrary). See my issue about it.
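The autogen.sh point in the list above can be reproduced in isolation. This is only a sketch: the temporary directory and the one-line script are stand-ins made up for the demonstration, not files from the M2 tree. It shows that direct execution depends on the file's mode bits, while passing the script to sh only needs read permission.

```shell
#!/bin/sh
# Stand-in for a vendored source tree; the real autogen.sh is not reproduced here.
workdir=$(mktemp -d)
printf '#!/bin/sh\necho "autogen ran"\n' > "$workdir/autogen.sh"
chmod a-x "$workdir/autogen.sh"   # simulate a checkout that lost the executable bit

# Direct execution depends on the mode bits, so it fails here.
if (cd "$workdir" && ./autogen.sh) 2>/dev/null; then
    direct_result=ok
else
    direct_result=failed
fi

# Handing the script to sh only requires read permission.
sh_result=$(cd "$workdir" && sh ./autogen.sh)

echo "direct: $direct_result"
echo "via sh: $sh_result"
rm -rf "$workdir"
```

In a sandboxed build the mode bits of unpacked sources are not always what upstream expects, which is why the sh form tends to be the safer choice.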

Most of these are addressed in a blunt way with my patch file. For example, I just removed any git stuff that was getting in the way of the build. But it would be great to see them fixed more cleanly here in the main repository.

An additional problem is this:

  • The documentation takes a really long time to build, maybe more than half of the build time.

Is there a way to build and install the M2 binary without the docs? Breaking it up this way would allow me to say I've successfully built a mostly-working M2 that might already be useful to others.

Related:

  • https://github.com/NixOS/nixpkgs/pull/251814
  • https://github.com/NixOS/nixpkgs/issues/186823

pdg137 avatar May 07 '24 06:05 pdg137

Very cool!

The mpfr issue sounds like a similar one we had building on ARM Macs, where Homebrew wasn't installing the library in a location that dlopen could find by default. See https://github.com/Macaulay2/M2/issues/2877#issuecomment-1593283953. The "fix" is pretty hacky, but maybe something similar would work:

https://github.com/Macaulay2/M2/blob/ec65028f1527076b663279b1311188caa9e22b67/M2/Macaulay2/packages/ForeignFunctions.m2#L558-L574

To avoid building the documentation, you can run configure with the --disable-documentation option.

d-torrance avatar May 07 '24 10:05 d-torrance

Okay, I've added --disable-documentation and now it builds and installs what might be a mostly-working M2 system! Now is there a way to try building the documentation later as a separate install step?

I was hoping that the final install would magically fix dlopen, but that does not seem to be the case. I don't think I can do something exactly like the Homebrew fix since the location of libraries is stored within the binary itself using "RUNPATH", but this is beyond my level of understanding of C++. Based on strace, it does seem like libmpfr is found correctly when the program starts up.

If I try this:

strace ./result/bin/M2 --no-threads  --no-readline --stop -e 'importFrom_Core {"dlopen"}; dlopen "libhello.so"'

It ends without looking in all the normal paths:

readlink("/home/paul", 0x7ffcf5c8e4a0, 1023) = -1 EINVAL (Invalid argument)
readlink("/home/paul/.Macaulay2", 0x7ffcf5c8e4a0, 1023) = -1 EINVAL (Invalid argument)
readlink("/home/paul/.Macaulay2/init.m2", 0x7ffcf5c8e4a0, 1023) = -1 EINVAL (Invalid argument)
newfstatat(AT_FDCWD, "/home/paul/.Macaulay2/init.m2", {st_mode=S_IFREG|0644, st_size=1495, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "./result/bin/../lib/Macaulay2/lib/glibc-hwcaps/x86-64-v3/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "./result/bin/../lib/Macaulay2/lib/glibc-hwcaps/x86-64-v2/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "./result/bin/../lib/Macaulay2/lib/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v3/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v2/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/1zy01hjzwvvia6h9dq5xar88v77fgh9x-glibc-2.38-44/lib/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/1zy01hjzwvvia6h9dq5xar88v77fgh9x-glibc-2.38-44/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/1zy01hjzwvvia6h9dq5xar88v77fgh9x-glibc-2.38-44/lib/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/j6n6ky7pidajcc3aaisd5qpni1w1rmya-xgcc-12.3.0-libgcc/lib/glibc-hwcaps/x86-64-v3/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/nix/store/j6n6ky7pidajcc3aaisd5qpni1w1rmya-xgcc-12.3.0-libgcc/lib/glibc-hwcaps/x86-64-v3/", 0x7ffcf5c8ee10, 0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/j6n6ky7pidajcc3aaisd5qpni1w1rmya-xgcc-12.3.0-libgcc/lib/glibc-hwcaps/x86-64-v2/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/nix/store/j6n6ky7pidajcc3aaisd5qpni1w1rmya-xgcc-12.3.0-libgcc/lib/glibc-hwcaps/x86-64-v2/", 0x7ffcf5c8ee10, 0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/nix/store/j6n6ky7pidajcc3aaisd5qpni1w1rmya-xgcc-12.3.0-libgcc/lib/libhello.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/nix/store/j6n6ky7pidajcc3aaisd5qpni1w1rmya-xgcc-12.3.0-libgcc/lib/", {st_mode=S_IFDIR|0555, st_size=4096, ...}, 0) = 0
alarm(0)                                = 0
getcwd("/home/paul/gitprojects/public/macaulay2-build", 1024) = 46
readlink("/home/paul/gitprojects/public/macaulay2-build/currentString", 0x7ffcf5c8e370, 1023) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, NULL, 0x7ffcf5c8e7c0, AT_SYMLINK_NOFOLLOW) = -1 EFAULT (Bad address)
getcwd("/home/paul/gitprojects/public/macaulay2-build", 699) = 46
getcwd("/home/paul/gitprojects/public/macaulay2-build", 699) = 46
getcwd("/home/paul/gitprojects/public/macaulay2-build", 699) = 46
write(2, "currentString:1:29:(3):[2]: erro"..., 106currentString:1:29:(3):[2]: error: libhello.so: cannot open shared object file: No such file or directory

pdg137 avatar May 07 '24 15:05 pdg137

installPackage is the function that builds the documentation for each package, so you could write a script that loops over the list of all the packages (separate(" ", version#"packages")) and runs it.

IIRC, Style needs to be installed first (it contains the images and CSS files used by the other packages), then FirstPackage, then Macaulay2Doc, and then after that the order doesn't matter.
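A rough shell sketch of that ordering, in case it helps. The helper names order_packages and install_docs and the M2_BIN variable are made up for illustration, and Depth and Elimination are just example package names; the real list would come from separate(" ", version#"packages"). The driver function is defined but not called, since it assumes a working M2 binary and that ending the -e expression with exit 0 is enough to leave the interpreter.

```shell
#!/bin/sh
# order_packages: print Style, FirstPackage, Macaulay2Doc first (when present),
# then every other name in its original order.
order_packages() {
    first="Style FirstPackage Macaulay2Doc"
    ordered=""
    for p in $first; do
        for q in "$@"; do
            if [ "$p" = "$q" ]; then
                ordered="$ordered $p"
                break
            fi
        done
    done
    for q in "$@"; do
        case " $first " in
            *" $q "*) ;;                      # already emitted above
            *) ordered="$ordered $q" ;;
        esac
    done
    echo $ordered
}

# Hypothetical driver (not called here): runs installPackage once per package.
# M2_BIN is a placeholder for the path to the installed M2 binary.
install_docs() {
    for pkg in $(order_packages "$@"); do
        "${M2_BIN:-M2}" --stop -e "installPackage \"$pkg\"; exit 0"
    done
}

order_packages Depth Macaulay2Doc Style Elimination FirstPackage
# prints: Style FirstPackage Macaulay2Doc Depth Elimination
```

Running one M2 process per package is slow but keeps a failure in one package from hiding the rest.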

d-torrance avatar May 07 '24 16:05 d-torrance

You could try using the CMake build which builds the documentation in parallel and allows you to set runtime path for dynamic libraries in the binary itself:

  1. https://cmake.org/cmake/help/latest/prop_tgt/BUILD_RPATH.html
  2. https://cmake.org/cmake/help/latest/prop_tgt/INSTALL_RPATH.html

mahrud avatar May 08 '24 22:05 mahrud

In the first note in my issue about dlopen I show that RUNPATH does include the path to the mpfr libraries. I suppose this is automatically done by the nix build tools, and everything else seems to work normally, but debugging this is beyond my C++ skill level. Would CMake somehow set it up better?

pdg137 avatar May 09 '24 06:05 pdg137

Meanwhile, my latest improvement is to start building some of the "downloads" separately, to make the build more modular and faster to develop:

https://github.com/pdg137/macaulay2-build/compare/master...testing

Unfortunately it seems to fail at runtime since M2 expects normaliz to be on the PATH:

$ ./result/bin/M2
Macaulay2, version 1.23
/nix/store/y2xhqfz2cci96qjdqnqdzapqknkmbhsg-M2/share/Macaulay2/Core/programs.m2:120:29:(1):[70]: error: could not find normaliz

Is this a problem with configure? I guess ideally it would record the path to the normaliz executable permanently when it finds it:

checking whether the package normaliz is installed... /nix/store/rn9w58rc0x622d0hdl2i86im7fbbr24y-normaliz/bin/normaliz
yes

But I'm not sure what is normal for this kind of thing.

pdg137 avatar May 09 '24 07:05 pdg137

The configure script works by running command -v normaliz, but Macaulay2 checks in various directories to see if normaliz exists and is executable before actually running anything on the shell.
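The mismatch can be sketched in plain shell. This is an illustration under assumptions, not M2's actual lookup code (that lives in Core/programs.m2 and is not reproduced here): find_in_dirs is a made-up helper, and the temporary directories stand in for a Nix store path and for the directories M2 checks at run time.

```shell
#!/bin/sh
# Fake "store" directory holding a stand-in normaliz executable.
store=$(mktemp -d)
mkdir -p "$store/bin"
printf '#!/bin/sh\necho fake normaliz\n' > "$store/bin/normaliz"
chmod +x "$store/bin/normaliz"

# Configure-time view: command -v searches whatever PATH is set at that moment.
configure_time=$(PATH="$store/bin:$PATH"; command -v normaliz)

# Run-time view (hypothetical helper): only the listed directories are checked,
# and PATH is never consulted.
find_in_dirs() {
    name=$1; shift
    for dir in "$@"; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

empty=$(mktemp -d)   # stands in for a search directory that lacks normaliz
run_time=$(find_in_dirs normaliz "$empty") || run_time="not found"

echo "configure time: $configure_time"
echo "run time:       $run_time"
```

So a program that configure found through the build-time PATH can still be invisible later unless its directory ends up in the list that gets searched.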

If there's some nix tool that you can run to tell you what /nix/store/... directory contains normaliz, you could use it to add that directory to the programPaths hash table. Then M2 will know to check there. Something like:

programPaths#"normaliz" = "/nix/store/...

d-torrance avatar May 10 '24 14:05 d-torrance

Yes, we have access to all of the exact paths to dependencies even before they are built. That's at the heart of how Nix works! But isn't programPaths for the user of M2 to configure? The documentation there suggests that there are some better places to compile in system paths:

In particular, findProgram already checks prefixDirectory | currentLayout#"programs", (where the programs shipped with Macaulay2 are installed)

Shouldn't we be able to get normaliz into the list of places it "already checks"? I thought that kind of thing would be the job of the configure script. Or if there's a directory built to store those programs, I could symlink normaliz into it.

pdg137 avatar May 11 '24 18:05 pdg137

Linking into libexec/Macaulay2/bin/ seems to do it! I found that path in startup.m2.

https://github.com/pdg137/macaulay2-build/commit/4d91d5e376374d6e44490241a47f65ec69a654a8

It's kind of silly since there doesn't seem to be anything else in that directory (should I link some other programs?), but with the link there, M2 starts without errors.
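For anyone following along, the linking step amounts to something like the sketch below. The two temporary directories are stand-ins for the normaliz output and the M2 install prefix (in a real derivation both would be store paths), and the fake normaliz script is made up for the demonstration; only the libexec/Macaulay2/bin/ layout comes from the thread.

```shell
#!/bin/sh
# Stand-ins for two Nix store outputs; real store paths are chosen by Nix.
normaliz_out=$(mktemp -d)   # hypothetical: the normaliz package output
m2_out=$(mktemp -d)         # hypothetical: the M2 install prefix

mkdir -p "$normaliz_out/bin"
printf '#!/bin/sh\necho fake normaliz\n' > "$normaliz_out/bin/normaliz"
chmod +x "$normaliz_out/bin/normaliz"

# Link the executable into the directory M2 checks for bundled programs.
mkdir -p "$m2_out/libexec/Macaulay2/bin"
ln -s "$normaliz_out/bin/normaliz" "$m2_out/libexec/Macaulay2/bin/normaliz"

# Confirm where the link points.
link_target=$(readlink "$m2_out/libexec/Macaulay2/bin/normaliz")
echo "normaliz -> $link_target"
```

A symlink keeps normaliz as a separate build output, so it can be rebuilt or swapped without touching the M2 build itself.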

pdg137 avatar May 11 '24 19:05 pdg137

I now have this build working without any of the "downloads"; everything is built from the official source or nixpkgs:

https://github.com/pdg137/macaulay2-build

So it's more modular and clear about what's going on. The final M2 part of the build takes about 17 mins on this Ryzen 5 desktop under WSL.

pdg137 avatar May 11 '24 22:05 pdg137