easybuild-framework
add support for data installations
motivation
- leverage EB to install data in a standardized way with proper versioning and checksumming
- support adding datasets as dependencies for software
- easily swap dataset versions with `ml swap`
changes
- add cmd line option `--installpath-data`, similar to `--installpath-software`
- add cmd line option `--subdir-data` (default = `data`), similar to `--subdir-software`
- add cmd line option `--sourcepath-data`, similar to `--sourcepath`
- add Easyconfig parameter `data_sources`, similar to `sources`
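
To illustrate the new parameter, here is a minimal sketch of what an easyconfig for a dataset might look like. The dataset name, version, homepage and file name are all hypothetical, and the sketch assumes that `data_sources` accepts a list of file names in the same way `sources` does; the exact accepted format is defined by the implementation in this PR, not by this example.

```python
# Hypothetical easyconfig sketch for a dataset; every value below is illustrative.
name = 'ExampleDataset'  # hypothetical dataset name
version = '2024.01'      # hypothetical dataset version

homepage = 'https://example.org/dataset'
description = "Placeholder dataset used to illustrate data_sources."

# Assumption: a dataset does not depend on a compiler toolchain.
toolchain = {'name': 'system', 'version': 'system'}

# data_sources is the parameter added by this PR; it is assumed here
# to take a list of file names, analogous to sources.
data_sources = ['%s-%s.tar.gz' % (name, version)]

moduleclass = 'data'
```

Under these assumptions, a newer dataset release would only need a new easyconfig with a different `version`, which is what makes swapping versions via modules possible.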
design
- the main reason for a separate `subdir_data` is reusability: in contrast to software, data does not have to be rebuilt/reinstalled when, for example, upgrading the OS or building for a new architecture
- the reason for a separate `sourcepath_data` is that datasets can be very large, so you may want to store them in a different file system or location.
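
The relationship between the install path options can be sketched as follows. This is not the framework code: `data_install_prefix` is a hypothetical helper, and the precedence shown (an explicit data install path wins, otherwise the data subdirectory is appended to the general install path) is an assumption based on the options being described as similar to `--installpath-software` and `--subdir-software`.

```python
import posixpath


def data_install_prefix(installpath, installpath_data=None, subdir_data="data"):
    """Return the directory under which datasets would be installed.

    Hypothetical helper, not part of EasyBuild. Assumed precedence:
    an explicit installpath_data overrides installpath + subdir_data.
    """
    if installpath_data:
        return installpath_data
    return posixpath.join(installpath, subdir_data)


# Default: data lands in a 'data' subdirectory of the general install path.
default_prefix = data_install_prefix("/opt/easybuild")

# Explicit data path, e.g. a shared file system that survives OS upgrades.
shared_prefix = data_install_prefix("/opt/easybuild", installpath_data="/shared/datasets")
```

With a layout like this, the software tree can be rebuilt per OS or architecture while the data tree is left untouched.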