easybuild-framework
Rbundle build extremely slow even with --skip (655 exts ~27min)
I am building an R bundle with 655 extensions, but running it with --skip --rebuild takes ~27 minutes, even though the only thing that needs to happen is rerunning the sanity checks.
What I ran, effectively:
eb R-bundle-community-3.6.3-foss-2019a-R-3.6.3.eb # fails after 655 modules
eb R-bundle-community-3.6.3-foss-2019a-R-3.6.3.eb --module-only
eb R-bundle-community-3.6.3-foss-2019a-R-3.6.3.eb --skip --rebuild
The --skip --rebuild stage takes about 27 minutes:
grep -i class easybuild-R-bundle-community-3.6.3-20200922.163759.kaFbB.log | grep -i instance
== 2020-09-22 16:39:48,950 easyconfig.py:1790 INFO Successfully obtained EB_Rmpi class instance from easybuild.easyblocks.rmpi
== 2020-09-22 17:02:09,211 easyconfig.py:1790 INFO Successfully obtained EB_Rserve class instance from easybuild.easyblocks.rserve
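To make the gap between those two log lines explicit, here is a small sketch that subtracts the timestamps. The helper name `log_gap_seconds` is made up for illustration; the timestamp format is taken from the log lines above.

```python
from datetime import datetime

# Timestamp layout used in the EasyBuild log lines quoted above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def log_gap_seconds(first, second):
    """Return the number of seconds between two log timestamps (hypothetical helper)."""
    start = datetime.strptime(first, LOG_TIME_FORMAT)
    end = datetime.strptime(second, LOG_TIME_FORMAT)
    return (end - start).total_seconds()


# Timestamps of the EB_Rmpi and EB_Rserve lines from the grep output.
gap = log_gap_seconds("2020-09-22 16:39:48,950", "2020-09-22 17:02:09,211")
minutes, seconds = divmod(gap, 60)
print(f"{int(minutes)} min {seconds:.3f} s")  # prints "22 min 20.261 s"
```

So roughly 22 of the ~27 minutes pass between those two messages.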
Finally got a full extension build through:
<snip 652 packages>
== skipping extension argparse
== skipping extension intergraph
== skipping extension ggnetwork
== restore after iterating...
== postprocessing [skipped]
== sanity checking...
== cleaning up...
== creating module...
== comparing module file with backup /opt/scp/unsupported/scpops/SCPAPP-2383_R-3.6.3-community-bundle/modules/all/R-bundle-community/3.6.3-foss-2019a-R-3.6.3.bak_20200922171610_19876; no differences found
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 32 min 23 sec)
== Results of the build can be found in the log file(s) /opt/scp/unsupported/scpops/SCPAPP-2383_R-3.6.3-community-bundle/software/R-bundle-community/3.6.3-foss-2019a-R-3.6.3/easybuild/easybuild-R-bundle-community-3.6.3-20200922.174820.log
== Build succeeded for 1 out of 1
== Temporary log file(s) /tmp/eb-jnCchQ/easybuild-zWyXFD.log* have been removed.
== Temporary directory /tmp/eb-jnCchQ has been removed.
real 32m37.794s
user 29m50.290s
sys 1m50.925s
@BenjaminHCCarr This issue should be largely fixed with the changes in #3498.
There's more room for improvement, but with less performance benefit (and requiring significantly more work to implement)...
Update on this: there's some work-in-progress on support for installing extensions in parallel, especially R packages: see https://github.com/easybuilders/easybuild-framework/pull/3667 and https://github.com/easybuilders/easybuild-easyblocks/pull/2408 .
It's already functional, but it needs a little bit more love before we can include this in an EasyBuild release (perhaps we should make it an experimental feature to encourage testing).
Initial experimental support for installing extensions in parallel has been merged in https://github.com/easybuilders/easybuild-framework/pull/3667, and the RPackage easyblock has been updated accordingly in https://github.com/easybuilders/easybuild-easyblocks/pull/2408 such that installing R extensions can be done in parallel with the upcoming EasyBuild v4.5.0.
There are a couple of caveats still though:
- opt-in (for now) via `--parallel-extensions-install --experimental`
- requirements for installing extensions in parallel:
  - must be able to determine list of required dependencies (via `required_deps` property method in easyblock)
  - must be able to start installation asynchronously via `run_async` + check for completion via `async_cmd_check`
  - required extensions must be included in `exts_list`; extensions provided via dependencies are not taken into account yet;
- this works well with R easyconfigs: installation is several hours faster with enough cores available;
- known limitations:
  - doesn't work yet for `R-bundle-Bioconductor` (which depends on `R`), because all required dependencies must be listed in `exts_list`
  - skipping of installed extensions and sanity check for extensions is still done sequentially (but this can be changed too with limited additional effort);
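The requirements above amount to a dependency-aware scheduler: an extension may only start once everything it requires has finished, and independent extensions can run side by side. The following is a rough sketch of that idea only, not EasyBuild's implementation. The function `install_in_parallel`, its parameters, and the thread pool are all assumptions made for illustration; the real code starts shell commands via `run_async` and polls them via `async_cmd_check`.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def install_in_parallel(required_deps, install, max_workers=4):
    """Toy scheduler (hypothetical, not EasyBuild code).

    required_deps: dict mapping extension name -> iterable of required extension names.
    install:       callable taking an extension name; stands in for the real install step.
    Returns extension names in the order their installations completed.
    """
    # Mirror the caveat above: every required extension must itself be in the list.
    missing = {d for deps in required_deps.values() for d in deps} - set(required_deps)
    if missing:
        raise ValueError(f"dependencies not in extension list: {sorted(missing)}")

    pending = {name: set(deps) for name, deps in required_deps.items()}
    installed, order = set(), []
    running = {}  # future -> extension name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # Start every extension whose required dependencies are all installed.
            ready = [name for name, deps in pending.items() if deps <= installed]
            for name in ready:
                running[pool.submit(install, name)] = name
                del pending[name]
            if not running:
                raise ValueError(f"circular dependencies among: {sorted(pending)}")
            # Block until at least one running installation finishes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any error from the install step
                name = running.pop(future)
                installed.add(name)
                order.append(name)
    return order


# Illustrative dependency map; the install step here does nothing.
example_deps = {"Rcpp": [], "magrittr": [], "stringi": [], "stringr": ["stringi", "magrittr"]}
print(install_in_parallel(example_deps, install=lambda name: None))
```

In this sketch `stringr` is only submitted after both `stringi` and `magrittr` have completed, while `Rcpp` can be installed at any point alongside them. The completion order of independent extensions is not deterministic.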