easybuild-framework

add support for installing in new namespace with bwrap

Open smoors opened this issue 3 years ago • 5 comments

when reinstalling software with EB on an active cluster, the software is unavailable or in a broken state during the reinstallation process, which might take a long time

here we propose a 2-step install procedure to keep the time that the software is unavailable as short as possible:

  1. install the software in a new namespace in a different location with bwrap (bubblewrap)
  2. copy the reinstalled software to the original location, as fast and as "atomically" as possible

here is an example of how bwrap can be used:

bwrap --bind / / --bind /path/to/bwrap/software/name /path/to/software/name --bind /path/to/bwrap/module/name /path/to/module/name --dev /dev --bind /dev/log /dev/log

the separate binds for software and modules are necessary in case of a non-default installation directory tree. for example, at our site it is organized as follows:

--software
    --soft1_name
        --version1
--modules
    --toolchain_generation1
        --all
            --soft1_name
                --version1
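As a rough sketch, the bind arguments from the bwrap example could be assembled from a list of (staging dir, real dir) pairs. The function name `bwrap_bind_args` and the easyconfig name `example.eb` are made up for illustration; only the bwrap options already shown in the example are used.

```python
import shlex


def bwrap_bind_args(binds, command):
    """Build a bwrap argv that maps each staging dir onto its real path.

    binds:   list of (staging_dir, real_dir) pairs (hypothetical layout)
    command: argv of the program to run inside the new namespace
    """
    argv = ["bwrap", "--bind", "/", "/"]
    for staging_dir, real_dir in binds:
        # installs written to real_dir inside the namespace land in staging_dir
        argv += ["--bind", staging_dir, real_dir]
    # same device setup as in the example command
    argv += ["--dev", "/dev", "--bind", "/dev/log", "/dev/log"]
    return argv + list(command)


binds = [
    ("/path/to/bwrap/software/name", "/path/to/software/name"),
    ("/path/to/bwrap/module/name", "/path/to/module/name"),
]
# "example.eb" is a placeholder easyconfig name
print(shlex.join(bwrap_bind_args(binds, ["eb", "example.eb"])))
```

Keeping the binds as data makes it easy to add one pair per software and module directory when the two trees live in different places.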

here is a way to copy the installation directory that is not fully atomic, but close. this assumes that both the bwrap and the original directory are on the same file system:

rsync -av --delete-after --link-dest=/path/to/bwrap/dir /path/to/bwrap/dir/ /path/to/original/dir/

with --link-dest=, files are not copied but hardlinks are created to the bwrap dir
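A minimal sketch of this copy step, assuming hypothetical helper names `same_filesystem` and `rsync_sync_args`. Comparing device IDs is a necessary check for hard links, not a sufficient one. The trailing slashes on source and destination are added here so that rsync syncs the contents of the directory rather than nesting it one level deeper.

```python
import os


def same_filesystem(path_a, path_b):
    """Hard links only work within one file system, so compare device IDs."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def rsync_sync_args(staging_dir, target_dir):
    """Build the rsync argv for syncing staging_dir into target_dir."""
    staging = staging_dir.rstrip("/")
    target = target_dir.rstrip("/")
    return [
        "rsync", "-av", "--delete-after",
        # unchanged files become hard links into the staging dir
        "--link-dest=" + staging,
        staging + "/",
        target + "/",
    ]


print(rsync_sync_args("/path/to/bwrap/dir", "/path/to/original/dir"))
```

The resulting list can be passed to `subprocess.run` once `same_filesystem` has confirmed that both directories report the same device.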

smoors, Oct 25 '22 07:10

I've discussed this a bit with @smoors last week, and an idea popped up during the discussion which I want to note down here...

If we want to leverage bwrap to trick EasyBuild into installing software in a particular path while the binaries & co are actually located in another path, then the whole EasyBuild session needs to be launched with bwrap. That creates a chicken-and-egg situation: we need to configure EasyBuild to use bwrap, but the EasyBuild configuration has to be parsed first to know whether or not bwrap should be used.

To circumvent that problem, we could make a (somewhat intrusive) change to the eb wrapper: we could implement a small Python helper script that specifies how EasyBuild should be called. That information could then be used by the eb wrapper to actually start the EasyBuild session.

Without EasyBuild being configured to use bwrap, it would produce output like:

"${PYTHON}" -m "${EASYBUILD_MAIN}"

If EasyBuild is configured to use bwrap, it would produce output like:

bwrap --bind / / --bind ... "${PYTHON}" -m "${EASYBUILD_MAIN}"
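A minimal sketch of what such a helper could look like. The function name `launch_command` and the way the bind pairs reach it are assumptions; how the helper would read them from the EasyBuild configuration is left open here. `${PYTHON}` and `${EASYBUILD_MAIN}` are emitted literally so that the eb wrapper expands them.

```python
import shlex


def launch_command(binds=None):
    """Return the shell words the eb wrapper should use to start EasyBuild.

    binds: optional list of (source_dir, dest_dir) pairs; empty or None
           means bwrap is not configured (hypothetical input format)
    """
    python_cmd = ['"${PYTHON}"', "-m", '"${EASYBUILD_MAIN}"']
    if not binds:
        return python_cmd
    prefix = ["bwrap", "--bind", "/", "/"]
    for source_dir, dest_dir in binds:
        # quote paths so the wrapper can safely evaluate the output
        prefix += ["--bind", shlex.quote(source_dir), shlex.quote(dest_dir)]
    return prefix + python_cmd


print(" ".join(launch_command()))
print(" ".join(launch_command([("/path/to/bwrap/software/name", "/path/to/software/name")])))
```

Printing a single line keeps the change to the eb wrapper small: it only has to capture the output and evaluate it.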

boegel, Apr 29 '25 06:04

Is bwrap guaranteed to work? Don't you need some kind of sanity check on that configuration to make sure it will? I ask because I was playing around with something similar last week and if user namespaces are disabled you will not be able to bind in / IIRC.
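One possible form of such a sanity check, sketched under the assumption that running a trivial command through `bwrap --bind / /` is representative of the real session. The function name `bwrap_usable` is hypothetical; the `run` and `which` parameters exist only so the check can be exercised without bwrap installed.

```python
import shutil
import subprocess


def bwrap_usable(run=subprocess.run, which=shutil.which):
    """Return True if a trivial command runs inside `bwrap --bind / /`."""
    if which("bwrap") is None:
        # bwrap is not installed or not on PATH
        return False
    try:
        result = run(
            ["bwrap", "--bind", "/", "/", "true"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False
    # a non-zero exit code is expected when namespace creation is not permitted
    return result.returncode == 0


if __name__ == "__main__":
    print("bwrap usable:", bwrap_usable())
```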

ocaisa, Apr 29 '25 07:04

in order for this to work reliably with --robot, we need a few extra things:

  • we need to bind mount the installdir of each software that we will install with bwrap. we thus need eb to first generate this list, which is essentially what eb --dry-run --robot does.
  • as the installdirs are bind mounted, we cannot remove them, so we need to prevent eb from removing them in all cases, see https://github.com/easybuilders/easybuild-framework/pull/4894
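The first point could be sketched roughly as follows. The line format matched by the regular expression is an assumption about what `eb --dry-run --robot` prints and should be checked against real output. The sample lines are made up. Mapping a module name directly onto an installdir subpath only holds for a layout like the one described earlier in this issue.

```python
import re

# Assumed shape of one dry-run line (not verified here):
#  * [ ] /path/to/easyconfig.eb (module: name/version)
DRY_RUN_LINE = re.compile(r"^\s*\*\s*\[(.)\]\s+\S+\s+\(module:\s*(\S+)\)\s*$")


def modules_to_install(dry_run_output):
    """Return module names of all entries not marked as installed ('x')."""
    modules = []
    for line in dry_run_output.splitlines():
        match = DRY_RUN_LINE.match(line)
        if match and match.group(1) != "x":
            modules.append(match.group(2))
    return modules


def installdir_binds(modules, staging_root, software_root):
    """Pair each module 'name/version' with (staging, real) installdirs."""
    return [(f"{staging_root}/{mod}", f"{software_root}/{mod}") for mod in modules]


# made-up sample output, for illustration only
sample = """\
 * [x] /cfgs/d/dep/dep-1.0.eb (module: dep/1.0)
 * [ ] /cfgs/s/soft1_name/soft1_name-version1.eb (module: soft1_name/version1)
"""
todo = modules_to_install(sample)
print(installdir_binds(todo, "/path/to/bwrap/software", "/path/to/software"))
```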

smoors, May 25 '25 11:05

The traditional way to do this is to use $DESTDIR, most distributions use this mechanism to create packages; Gentoo uses it by default without needing to create the package file; it uses a staging area before the final sync. I'd suspect the majority of easyconfigs support doing make install DESTDIR=xxx but of course there are exceptions. I've never tried EB's builtin rpm support but I suspect it doesn't use DESTDIR?

I've used bwrap myself inside an easyconfig for matlab, see https://github.com/ComputeCanada/easybuild-easyconfigs/blob/computecanada-main/easybuild/easyconfigs/m/MATLAB/MATLAB-2024b.1.eb, using a tmpfs, but that's just for the install step.

Note that bwrap 0.11 supports overlayfs which could simplify all this significantly: instead of

bwrap --bind / / --bind /path/to/bwrap/software/name /path/to/software/name --bind /path/to/bwrap/module/name /path/to/module/name --dev /dev --bind /dev/log /dev/log

one could use

bwrap --dev-bind / / --overlay-src /path/to --overlay /tmp/path/to/rwsrc /tmp/path/to/workdir /path/to eb ...

then EasyBuild will really install under /tmp/path/to/rwsrc and you can rsync from there once it's finished without needing the extra parsing. Note that parallel file systems may not support being an "RWSRC" for overlayfs so to use --link-dest= you'd have to use an intermediate copy between /tmp and /path/to.
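A small sketch of how the overlay variant could be assembled, using only the bwrap options from the command above. The function name `bwrap_overlay_args` and the easyconfig name `example.eb` are hypothetical.

```python
import shlex


def bwrap_overlay_args(lower_dir, rw_dir, work_dir, command):
    """Build a bwrap argv that overlays a writable layer on top of lower_dir.

    lower_dir: existing tree, used as read-only source and as mount point
    rw_dir:    directory that receives everything written during the build
    work_dir:  overlayfs work directory (must be on the same file system as rw_dir)
    command:   argv of the program to run inside the new namespace
    """
    return [
        "bwrap", "--dev-bind", "/", "/",
        "--overlay-src", lower_dir,
        "--overlay", rw_dir, work_dir, lower_dir,
    ] + list(command)


# "example.eb" is a placeholder easyconfig name
argv = bwrap_overlay_args(
    "/path/to", "/tmp/path/to/rwsrc", "/tmp/path/to/workdir", ["eb", "example.eb"]
)
print(shlex.join(argv))
```

After the session ends, only `rw_dir` needs to be synced, since it holds just the files the build created or changed.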

bartoldeman, May 26 '25 13:05

> Note that bwrap 0.11 supports overlayfs which could simplify all this significantly

i saw that too, but how does this differ from plain overlayfs? i tested overlayfs (without bwrap), which kind of works, but there were several issues with it.

smoors, Jun 02 '25 08:06