easybuild-framework
WIP: 20200602 dependency graph layering
This is what was briefly discussed on the EasyBuild #containers Slack channel last week.
As noted in the documentation at https://easybuild.readthedocs.io/en/latest/Containers.html#stacking-container-images, creating 'stacked' containers avoids redundant builds. For my own purposes (I wanted to migrate a workflow involving quite a few scientific software packages from an EasyBuild-maintained site to another that allows running Singularity containers), I put together some graph partitioning and layering functionality that arranges the dependencies of a certain set of target easyconfigs in layers in order to facilitate automatic creation of stacked container images. Those containers avoid as many redundant builds as possible and aim at being reusable for further container builds as well. The attached hoermann_eb_dep_graph_partitioning_and_layering.pdf illustrates without too many words what this code does.
Here are a few more comments.
- I use the pygraph package (https://github.com/Shoobx/python-graph), whose last commit dates from 2018, since it is already used in the framework, so no new dependencies are added. It serves its purpose well, but other packages might be able to take over some of the work done manually here (such as the transitive reduction, implemented following https://github.com/networkx/networkx/blob/9aedc31d291ac11eb0bb374c1ce8ad5cbcce02d3/networkx/algorithms/dag.py#L581).
- I am not a graph theory specialist, and the partitioning strategy implemented in dep_graph_partition is not based on any established publication. It is something made up from scratch that I found handles more than two targets in a not-too-dumb way, but it might be wise to replace it with an established strategy at some point.
- For the same reason, there might be inefficient and redundant operations in there, especially the (partial) graph copying loops.
- No special treatment for build dependencies.
- If loglevel is DEBUG, then .dot files are written for (sub-) graphs that arise at different points.
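To make the two graph operations mentioned above more concrete, here is a minimal sketch of a transitive reduction followed by a simple layering of a dependency DAG. It is not the code from this PR and does not use pygraph: it relies only on the standard library, the function names `transitive_reduction` and `layers` are made up for illustration, and the node names in the example are hypothetical rather than real easyconfig dependencies. The layering rule used here (layer index = length of the longest dependency chain below a node) is an assumption chosen for simplicity; the partitioning in dep_graph_partition may well behave differently.

```python
from collections import defaultdict


def transitive_reduction(deps):
    """Remove every direct edge that is already implied by a longer path.

    deps maps each node to the set of nodes it directly depends on.
    The graph is assumed to be acyclic.
    """
    def reachable(start):
        # all nodes reachable from start by following dependency edges
        seen, stack = set(), list(deps.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(deps.get(node, ()))
        return seen

    reduced = {}
    for node, direct in deps.items():
        implied = set()
        for dep in direct:
            implied |= reachable(dep)
        # keep only the direct dependencies not reachable via another one
        reduced[node] = set(direct) - implied
    return reduced


def layers(deps):
    """Group nodes by the length of their longest dependency chain."""
    depth = {}

    def visit(node):
        if node not in depth:
            below = [visit(dep) for dep in deps.get(node, ())]
            depth[node] = 1 + max(below, default=-1)
        return depth[node]

    for node in deps:
        visit(node)
    grouped = defaultdict(list)
    for node, level in depth.items():
        grouped[level].append(node)
    return [sorted(grouped[level]) for level in sorted(grouped)]


# hypothetical toy graph, not real EasyBuild dependency data
toy = {
    "app": {"mpi", "blas", "compiler"},
    "mpi": {"compiler"},
    "blas": {"compiler"},
    "compiler": set(),
}
reduced = transitive_reduction(toy)
for index, layer in enumerate(layers(reduced)):
    print(index, " ".join(layer))
```

In the toy graph the direct edge from `app` to `compiler` is dropped because it is already implied through `mpi` and `blas`, and the nodes end up in three layers from bottom to top.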
In short, a call such as
eb MUMPS-5.2.1-foss-2020a-metis.eb GROMACS-2020.1-foss-2020a-Python-3.8.2.eb LAMMPS-3Mar2020-foss-2020a-Python-3.8.2-kokkos.eb --dep-graph-layers -r --terse
produces output like this:
M4-1.4.18.eb
Bison-3.3.2.eb help2man-1.47.4.eb
Bison-3.5.3.eb flex-2.6.4.eb zlib-1.2.11.eb
binutils-2.34.eb
GCCcore-9.3.0.eb
M4-1.4.18-GCCcore-9.3.0.eb
Bison-3.5.3-GCCcore-9.3.0.eb help2man-1.47.12-GCCcore-9.3.0.eb
flex-2.6.4-GCCcore-9.3.0.eb zlib-1.2.11-GCCcore-9.3.0.eb
binutils-2.34-GCCcore-9.3.0.eb
ncurses-6.2-GCCcore-9.3.0.eb
expat-2.2.9-GCCcore-9.3.0.eb libreadline-8.0-GCCcore-9.3.0.eb
Perl-5.30.2-GCCcore-9.3.0.eb
Autoconf-2.69-GCCcore-9.3.0.eb
Automake-1.16.1-GCCcore-9.3.0.eb libtool-2.4.6-GCCcore-9.3.0.eb ncurses-6.1.eb
Autotools-20180311-GCCcore-9.3.0.eb gettext-0.20.1.eb
xorg-macros-1.19.2-GCCcore-9.3.0.eb XZ-5.2.5-GCCcore-9.3.0.eb
libpciaccess-0.16-GCCcore-9.3.0.eb libxml2-2.9.10-GCCcore-9.3.0.eb numactl-2.0.13-GCCcore-9.3.0.eb pkg-config-0.29.2-GCCcore-9.3.0.eb
GCC-9.3.0.eb hwloc-2.2.0-GCCcore-9.3.0.eb UCX-1.8.0-GCCcore-9.3.0.eb
OpenMPI-4.0.3-GCC-9.3.0.eb
gompi-2020a.eb OpenBLAS-0.3.9-GCC-9.3.0.eb
bzip2-1.0.8-GCCcore-9.3.0.eb cURL-7.69.1-GCCcore-9.3.0.eb FFTW-3.3.8-gompi-2020a.eb ScaLAPACK-2.1.0-gompi-2020a.eb
CMake-3.16.4-GCCcore-9.3.0.eb foss-2020a.eb
METIS-5.1.0-GCCcore-9.3.0.eb SCOTCH-6.0.9-gompi-2020a.eb
MUMPS-5.2.1-foss-2020a-metis.eb

M4-1.4.18.eb
Bison-3.3.2.eb help2man-1.47.4.eb
Bison-3.5.3.eb flex-2.6.4.eb zlib-1.2.11.eb
binutils-2.34.eb
GCCcore-9.3.0.eb
M4-1.4.18-GCCcore-9.3.0.eb
Bison-3.5.3-GCCcore-9.3.0.eb help2man-1.47.12-GCCcore-9.3.0.eb
flex-2.6.4-GCCcore-9.3.0.eb zlib-1.2.11-GCCcore-9.3.0.eb
binutils-2.34-GCCcore-9.3.0.eb
ncurses-6.2-GCCcore-9.3.0.eb
expat-2.2.9-GCCcore-9.3.0.eb libreadline-8.0-GCCcore-9.3.0.eb
Perl-5.30.2-GCCcore-9.3.0.eb
Autoconf-2.69-GCCcore-9.3.0.eb
Automake-1.16.1-GCCcore-9.3.0.eb libtool-2.4.6-GCCcore-9.3.0.eb ncurses-6.1.eb
Autotools-20180311-GCCcore-9.3.0.eb gettext-0.20.1.eb
xorg-macros-1.19.2-GCCcore-9.3.0.eb XZ-5.2.5-GCCcore-9.3.0.eb
libpciaccess-0.16-GCCcore-9.3.0.eb libxml2-2.9.10-GCCcore-9.3.0.eb numactl-2.0.13-GCCcore-9.3.0.eb pkg-config-0.29.2-GCCcore-9.3.0.eb
GCC-9.3.0.eb hwloc-2.2.0-GCCcore-9.3.0.eb UCX-1.8.0-GCCcore-9.3.0.eb
OpenMPI-4.0.3-GCC-9.3.0.eb
gompi-2020a.eb OpenBLAS-0.3.9-GCC-9.3.0.eb
bzip2-1.0.8-GCCcore-9.3.0.eb cURL-7.69.1-GCCcore-9.3.0.eb FFTW-3.3.8-gompi-2020a.eb ScaLAPACK-2.1.0-gompi-2020a.eb
CMake-3.16.4-GCCcore-9.3.0.eb foss-2020a.eb
Tcl-8.6.10-GCCcore-9.3.0.eb
GMP-6.2.0-GCCcore-9.3.0.eb libffi-3.3-GCCcore-9.3.0.eb SQLite-3.31.1-GCCcore-9.3.0.eb
Eigen-3.3.7-GCCcore-9.3.0.eb Python-3.8.2-GCCcore-9.3.0.eb
pybind11-2.4.3-GCCcore-9.3.0-Python-3.8.2.eb
SciPy-bundle-2020.03-foss-2020a-Python-3.8.2.eb
networkx-2.4-foss-2020a-Python-3.8.2.eb scikit-build-0.10.0-foss-2020a-Python-3.8.2.eb
GROMACS-2020.1-foss-2020a-Python-3.8.2.eb

M4-1.4.18.eb
Bison-3.3.2.eb help2man-1.47.4.eb
Bison-3.5.3.eb flex-2.6.4.eb zlib-1.2.11.eb
binutils-2.34.eb
GCCcore-9.3.0.eb
M4-1.4.18-GCCcore-9.3.0.eb
Bison-3.5.3-GCCcore-9.3.0.eb help2man-1.47.12-GCCcore-9.3.0.eb
flex-2.6.4-GCCcore-9.3.0.eb zlib-1.2.11-GCCcore-9.3.0.eb
binutils-2.34-GCCcore-9.3.0.eb
ncurses-6.2-GCCcore-9.3.0.eb
expat-2.2.9-GCCcore-9.3.0.eb libreadline-8.0-GCCcore-9.3.0.eb
Perl-5.30.2-GCCcore-9.3.0.eb
Autoconf-2.69-GCCcore-9.3.0.eb
Automake-1.16.1-GCCcore-9.3.0.eb libtool-2.4.6-GCCcore-9.3.0.eb ncurses-6.1.eb
Autotools-20180311-GCCcore-9.3.0.eb gettext-0.20.1.eb
xorg-macros-1.19.2-GCCcore-9.3.0.eb XZ-5.2.5-GCCcore-9.3.0.eb
libpciaccess-0.16-GCCcore-9.3.0.eb libxml2-2.9.10-GCCcore-9.3.0.eb numactl-2.0.13-GCCcore-9.3.0.eb pkg-config-0.29.2-GCCcore-9.3.0.eb
GCC-9.3.0.eb hwloc-2.2.0-GCCcore-9.3.0.eb UCX-1.8.0-GCCcore-9.3.0.eb
OpenMPI-4.0.3-GCC-9.3.0.eb
gompi-2020a.eb OpenBLAS-0.3.9-GCC-9.3.0.eb
bzip2-1.0.8-GCCcore-9.3.0.eb cURL-7.69.1-GCCcore-9.3.0.eb FFTW-3.3.8-gompi-2020a.eb ScaLAPACK-2.1.0-gompi-2020a.eb
CMake-3.16.4-GCCcore-9.3.0.eb foss-2020a.eb
Tcl-8.6.10-GCCcore-9.3.0.eb
GMP-6.2.0-GCCcore-9.3.0.eb libffi-3.3-GCCcore-9.3.0.eb SQLite-3.31.1-GCCcore-9.3.0.eb
Eigen-3.3.7-GCCcore-9.3.0.eb Python-3.8.2-GCCcore-9.3.0.eb
pybind11-2.4.3-GCCcore-9.3.0-Python-3.8.2.eb
SciPy-bundle-2020.03-foss-2020a-Python-3.8.2.eb
libpng-1.6.37-GCCcore-9.3.0.eb
freetype-2.10.1-GCCcore-9.3.0.eb gperf-3.1-GCCcore-9.3.0.eb Ninja-1.10.0-GCCcore-9.3.0.eb util-linux-2.35-GCCcore-9.3.0.eb
fontconfig-2.13.92-GCCcore-9.3.0.eb gettext-0.20.1-GCCcore-9.3.0.eb intltool-0.51.0-GCCcore-9.3.0.eb Meson-0.53.2-GCCcore-9.3.0-Python-3.8.2.eb
X11-20200222-GCCcore-9.3.0.eb
gzip-1.10-GCCcore-9.3.0.eb lz4-1.9.2-GCCcore-9.3.0.eb Python-2.7.18-GCCcore-9.3.0.eb Tk-8.6.10-GCCcore-9.3.0.eb
gc-7.6.12-GCCcore-9.3.0.eb libdrm-2.4.100-GCCcore-9.3.0.eb libglvnd-1.2.0-GCCcore-9.3.0.eb libunistring-0.9.10-GCCcore-9.3.0.eb libunwind-1.3.1-GCCcore-9.3.0.eb LLVM-9.0.1-GCCcore-9.3.0.eb Mako-1.1.2-GCCcore-9.3.0.eb Szip-2.1.1-GCCcore-9.3.0.eb Tkinter-3.8.2-GCCcore-9.3.0.eb zstd-1.4.4-GCCcore-9.3.0.eb
Doxygen-1.8.17-GCCcore-9.3.0.eb Guile-1.8.8-GCCcore-9.3.0.eb HDF5-1.10.6-gompi-2020a.eb matplotlib-3.2.1-foss-2020a-Python-3.8.2.eb Mesa-20.0.2-GCCcore-9.3.0.eb NASM-2.14.02-GCCcore-9.3.0.eb pkgconfig-1.5.1-GCCcore-9.3.0-Python-3.8.2.eb Yasm-1.3.0-GCCcore-9.3.0.eb
Boost-1.72.0-gompi-2020a.eb FriBidi-1.0.9-GCCcore-9.3.0.eb GSL-2.6-GCC-9.3.0.eb h5py-2.10.0-foss-2020a-Python-3.8.2.eb LAME-3.100-GCCcore-9.3.0.eb libGLU-9.0.1-GCCcore-9.3.0.eb libmatheval-1.1.11-GCCcore-9.3.0.eb molmod-1.4.5-foss-2020a-Python-3.8.2.eb netCDF-4.7.4-gompi-2020a.eb x264-20191217-GCCcore-9.3.0.eb x265-3.3-GCCcore-9.3.0.eb
archspec-0.1.0-GCCcore-9.3.0-Python-3.8.2.eb FFmpeg-4.2.2-GCCcore-9.3.0.eb kim-api-2.1.3-foss-2020a.eb libjpeg-turbo-2.0.4-GCCcore-9.3.0.eb PCRE-8.44-GCCcore-9.3.0.eb PLUMED-2.6.0-foss-2020a-Python-3.8.2.eb ScaFaCoS-1.0.1-foss-2020a.eb tbb-2020.1-GCCcore-9.3.0.eb Voro++-0.4.6-GCCcore-9.3.0.eb VTK-8.2.0-foss-2020a-Python-3.8.2.eb yaff-1.6.0-foss-2020a-Python-3.8.2.eb
LAMMPS-3Mar2020-foss-2020a-Python-3.8.2-kokkos.eb
where each line represents one layer (from lowest to highest) and each paragraph represents one target. The aim is for as many targets as possible to share as many of their lower layers as possible.
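As a rough illustration of how this output can be consumed, the following sketch parses such a listing into targets and layers and counts how many leading layers two targets have in common. It assumes that targets are separated by blank lines and that any log lines start with `==`; both function names are hypothetical, and the sample text is a heavily abbreviated stand-in for real output, not an actual `eb` run.

```python
def parse_layer_lists(text):
    """Return a list of targets, each a list of layers, each a list of easyconfigs."""
    targets, current = [], []
    for line in text.splitlines():
        if line.startswith("=="):
            # assumed log line, not part of the layer listing
            continue
        if line.strip():
            current.append(line.split())
        elif current:
            # a blank line closes the current target
            targets.append(current)
            current = []
    if current:
        targets.append(current)
    return targets


def shared_prefix_length(first, second):
    """Count the leading layers that are identical in both targets."""
    count = 0
    for layer_a, layer_b in zip(first, second):
        if layer_a != layer_b:
            break
        count += 1
    return count


# abbreviated, illustrative sample (not real eb output)
sample = """\
M4-1.4.18.eb
Bison-3.3.2.eb help2man-1.47.4.eb
MUMPS-5.2.1-foss-2020a-metis.eb

M4-1.4.18.eb
Bison-3.3.2.eb help2man-1.47.4.eb
GROMACS-2020.1-foss-2020a-Python-3.8.2.eb
"""
targets = parse_layer_lists(sample)
print(len(targets), shared_prefix_length(targets[0], targets[1]))
```

Every shared leading layer corresponds to a container image that only has to be built once and can then serve as the base for both targets.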
This bash script builds stacked container images, as an example of what can be done with such layering:
#!/bin/bash
set -euo pipefail
bootstrap_image="shahzebmsiddiqui/default/easybuild:centos-7"
declare -A levels=([DEBUG]=0 [INFO]=1 [WARN]=2 [ERROR]=3)
LOG_LEVEL="WARN"
DRY_RUN=
log_msg() {
    local log_priority=$1
    local log_message=$2
    # check if level exists (':-' avoids an unbound-variable error under 'set -u')
    [[ -n ${levels[$log_priority]:-} ]] || return 1
    # check if level is enough
    if (( ${levels[$log_priority]} >= ${levels[$LOG_LEVEL]} )); then
        echo "${log_priority} : ${log_message}"
    fi
}
usage() {
    echo -n "
Usage: $(basename "$0") [-dhnv] [--image BOOTSTRAP_IMAGE] [EASY_CONFIG [EASY_CONFIG [ ... ]]]
Build stacked container images in current working directory from BOOTSTRAP_IMAGE (default: ${bootstrap_image}).
Expects environment variable SINGULARITY_TMPDIR to be set.
"
}
function join_by { local IFS="$1"; shift; echo "$*"; }
# '--image' takes a value, so it must be declared as 'image:' in the long option list;
# testing the assignment directly keeps the error message reachable under 'set -e'
if ! args=$(getopt -n "$0" -l "help,verbose,debug,dry-run,image:" -o "hvdn" -- "$@"); then
    echo "Failed parsing options." >&2
    exit 1
fi
eval set -- "$args"
while true; do
    case "$1" in
        -h | --help ) usage ; exit 0 ;;
        --image ) bootstrap_image=$2; shift 2 ;;
        -n | --dry-run ) DRY_RUN=true; shift ;;
        -v | --verbose ) LOG_LEVEL=INFO; shift ;;
        -d | --debug ) LOG_LEVEL=DEBUG; shift ;;
        -- ) shift; break ;;
        * ) break ;;
    esac
done
# positional arguments, stored as an array since they are expanded as "${EASY_CONFIGS[@]}" below
EASY_CONFIGS=("$@")
mkdir -p "$(pwd)/sources"
mkdir -p /tmp/easybuild/
ln -sf "$(pwd)/sources" /tmp/easybuild/sources
# always cap concatenated recipe and image names at maximum file name length - reserved length
NAME_MAX=$(getconf NAME_MAX .)
RESERVED_LENGTH=32
MAX_NAME_LENGTH=$(( ${NAME_MAX} - ${RESERVED_LENGTH} ))
log_msg INFO "system max filename length: ${NAME_MAX}"
log_msg INFO "derived max name length: ${MAX_NAME_LENGTH}"
# print some information
if (( ${levels[$LOG_LEVEL]} <= ${levels["INFO"]} )); then
    for ec in ${EASY_CONFIGS[@]}; do
        cmd="eb ${ec} -Dr"
        log_msg INFO "exec: ${cmd}"
        ${cmd}
    done
    cmd="eb ${EASY_CONFIGS[@]} --dep-graph-layers -r --debug"
    log_msg INFO "exec: ${cmd}"
    ${cmd}
fi
# even with 'terse', eb prints log lines prefixed with '=='
cmd="eb ${EASY_CONFIGS[@]} --dep-graph-layers -r --terse"
log_msg INFO "exec: ${cmd}"
${cmd} | grep -v '==' > eb_layer_lists.txt
previous_layer=
while IFS= read -r layer; do
    if [ -n "${layer}" ]; then
        IFS=' ' read -r -a ecs <<< "$layer"
        log_msg INFO "layer: ${layer}"
        ec_basenames=$(for ec in "${ecs[@]}"; do basename "$ec" ".eb"; done)
        image_name="$(join_by _ ${ec_basenames[@]})"
        log_msg INFO "full name: ${image_name}"
        if (( ${#image_name} > ${MAX_NAME_LENGTH} )); then
            image_name="${image_name:0:${MAX_NAME_LENGTH}}"
            log_msg INFO "capped name: ${image_name}"
        fi
        image_file="${image_name}.sif"
        log_msg INFO "image: ${image_file}"
        if [ -f "${image_file}" ]; then
            log_msg INFO "skipped: '${image_file}' exists already."
        else
            cmd="eb ${layer[@]} --fetch --sourcepath /tmp/easybuild/sources"
            log_msg INFO "exec: ${cmd}"
            if [ -z "${DRY_RUN}" ]; then ${cmd}; fi
            # if previous layer empty, then we are at the beginning of the dependency chain, build new image
            if [ -z "${previous_layer}" ]; then
                cmd="eb -C --container-build-image ${ecs[@]} --containerpath $(pwd) \
                    --container-config bootstrap=library,from=${bootstrap_image},eb_args='-l' \
                    --experimental --force --container-image-name ${image_name} \
                    --container-image-format sif --container-tmpdir ${SINGULARITY_TMPDIR}"
                log_msg INFO "exec: ${cmd}"
                if [ -z "${DRY_RUN}" ]; then ${cmd}; fi
            else
                IFS=' ' read -r -a previous_ecs <<< "$previous_layer"
                previous_ec_basenames=$(for ec in "${previous_ecs[@]}"; do basename "$ec" ".eb"; done)
                previous_image_name="$(join_by _ ${previous_ec_basenames[@]})"
                log_msg INFO "full previous name: ${previous_image_name}"
                if (( ${#previous_image_name} > ${MAX_NAME_LENGTH} )); then
                    previous_image_name="${previous_image_name:0:${MAX_NAME_LENGTH}}"
                    log_msg INFO "capped previous name: ${previous_image_name}"
                fi
                previous_image_file="${previous_image_name}.sif"
                log_msg INFO "previous image: ${previous_image_file}"
                cmd="eb -C --container-build-image ${ecs[@]} --containerpath $(pwd) \
                    --container-config bootstrap=localimage,from=${previous_image_file},eb_args='-l' \
                    --experimental --force --container-image-name ${image_name} \
                    --container-image-format sif --container-tmpdir ${SINGULARITY_TMPDIR}"
                log_msg INFO "exec: ${cmd}"
                if [ -z "${DRY_RUN}" ]; then ${cmd}; fi
            fi
        fi
    else
        log_msg INFO "Reached target ${previous_layer}."
    fi
    previous_layer=${layer}
done < eb_layer_lists.txt
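The stacking logic of the loop above can be summarised in a few lines of Python. This is only a sketch of the decision the script makes for each layer (which image to build, and on top of what), not a replacement for it: the names `image_name` and `plan_builds` are hypothetical, the default `max_len=223` assumes a `NAME_MAX` of 255 minus the 32 reserved characters, and the easyconfig names in the example are placeholders.

```python
from pathlib import Path


def image_name(layer, max_len):
    """Join the easyconfig basenames (without '.eb') with '_' and cap the length."""
    stems = []
    for ec in layer:
        name = Path(ec).name
        stems.append(name[:-3] if name.endswith(".eb") else name)
    return "_".join(stems)[:max_len]


def plan_builds(targets, bootstrap_image, max_len=223, existing=()):
    """List (image_file, bootstrap_agent, source) for every image still to be built.

    targets is a list of targets, each a list of layers (lists of easyconfigs),
    ordered from the lowest layer to the target itself.
    """
    done = set(existing)
    plan = []
    for target_layers in targets:
        previous = None  # every target starts again from the bootstrap image
        for layer in target_layers:
            image = image_name(layer, max_len) + ".sif"
            if image not in done:
                if previous is None:
                    plan.append((image, "library", bootstrap_image))
                else:
                    plan.append((image, "localimage", previous))
                done.add(image)
            # skipped or built, this image is the base for the next layer
            previous = image
    return plan


# placeholder easyconfig names, two targets sharing their two lowest layers
example = [
    [["a.eb"], ["b.eb", "c.eb"], ["t1.eb"]],
    [["a.eb"], ["b.eb", "c.eb"], ["t2.eb"]],
]
for step in plan_builds(example, "bootstrap-image"):
    print(step)
```

Only four images are planned for the six layers in the example, because the two shared lower layers are built once and reused. One thing the sketch makes visible is that two different layers whose joined names agree in the first `max_len` characters would map to the same image file; the script does not guard against that case either.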
@jotelha I will try to take a look at this soon...but it is a big PR so it might take me a bit of time
@boegel before you actually add that to a release, I will probably have to rewrite the dep_graph_partition function to be based on some "proper" partitioning algorithm, as mentioned above (and add some tests). I won't have time to work on that until the second half of August, thus @ocaisa please take your time with reviewing the current state.
@jotelha I think it makes sense to add tests first before we take a closer look at this.
Some refactoring may be needed after review, but having tests will help us to make sense of it I think (which doesn't mean it's complicated code or anything, I haven't taken a close look at it yet).
Maybe even setting up a call to discuss this makes sense, to tackle this efficiently?
Sure, I did not manage to come back to this pull request yet, but I hope to add some tests soon, and I think a call would make sense after that. I will try to get back to you on that within, say, two weeks.