
add support for easystack file that contains easyconfig filenames + implement parsing of configuration options

Open boegel opened this issue 3 years ago • 4 comments

boegel avatar Jun 07 '22 12:06 boegel

Let me see if I get this right: essentially, you want to support an additional yaml structure where we don't specify the packages to be installed using e.g.

# easystack.yaml
software:
  R-bundle-Bioconductor:
    toolchains:
      foss-2020a:
        versions:
          '3.11':
            versionsuffix: '-R-4.0.0'
  GROMACS:
    toolchains:
      foss-2020a:
        versions:
          '2020.1':
            versionsuffix: '-Python-3.8.2'
          '2020.4':
            versionsuffix: '-Python-3.8.2'

You would specify

# easystack.yaml
easyconfigs:
  - R-bundle-Bioconductor-3.11-foss-2020a-R-4.0.0
  - GROMACS-2020.1-foss-2020a-Python-3.8.2
  - GROMACS-2020.4-foss-2020a-Python-3.8.2

Correct?
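To make the equivalence between the two layouts concrete, here is a minimal Python sketch that flattens the hierarchical `software` structure into the flat list of easyconfig names. The function name `expand_software` is hypothetical and not part of EasyBuild, and the sketch assumes names are simply `<name>-<version>-<toolchain><versionsuffix>`, as in the two examples above.

```python
def expand_software(software):
    """Flatten a hierarchical 'software' mapping into easyconfig names.

    Hypothetical helper for illustration only, not EasyBuild's own code.
    """
    names = []
    for name, spec in software.items():
        for toolchain, tc_spec in spec["toolchains"].items():
            versions = tc_spec["versions"]
            # 'versions' may be a plain list or a mapping of
            # version -> per-version settings (e.g. versionsuffix)
            if not isinstance(versions, dict):
                versions = {version: None for version in versions}
            for version, settings in versions.items():
                suffix = (settings or {}).get("versionsuffix", "")
                names.append(f"{name}-{version}-{toolchain}{suffix}")
    return names


software = {
    "R-bundle-Bioconductor": {
        "toolchains": {
            "foss-2020a": {"versions": {"3.11": {"versionsuffix": "-R-4.0.0"}}},
        },
    },
    "GROMACS": {
        "toolchains": {
            "foss-2020a": {
                "versions": {
                    "2020.1": {"versionsuffix": "-Python-3.8.2"},
                    "2020.4": {"versionsuffix": "-Python-3.8.2"},
                },
            },
        },
    },
}

flat_names = expand_software(software)
for easyconfig in flat_names:
    print(easyconfig)
```

For the input shown, this yields the same three names as the flat `easyconfigs` list.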

Some thoughts:

It's a lot more compact, but one of the main reasons we opted for the first (hierarchical) structure is that you probably want to add certain options at the software level (e.g. for all GROMACS installations, use some --include-easyblocks-from-pr or something). In the second example, you'd have to specify that twice, remove two items if the easyblock gets merged, etc. I'm fine with adding this as a second format; we can still use the first format whenever that is more suitable. However, if we add a 3rd, 4th and 5th later on, things will get quite confusing. So I'm curious: what was your main reason for wanting this format?

Another question that came to mind: how will we specify options here? I'm no YAML guru and don't know what's allowed/possible here, but can we do e.g.

# easystack.yaml
easyconfigs:
  - R-bundle-Bioconductor-3.11-foss-2020a-R-4.0.0
  - GROMACS-2020.1-foss-2020a-Python-3.8.2:
    from_pr: 1234
  - GROMACS-2020.4-foss-2020a-Python-3.8.2

or something? Or do you then envision something more like:

# easystack.yaml
easyconfigs:
  - R-bundle-Bioconductor-3.11-foss-2020a-R-4.0.0
  - GROMACS-2020.1-foss-2020a-Python-3.8.2 --from-pr 1234
  - GROMACS-2020.4-foss-2020a-Python-3.8.2

The second is easier to write, but harder to parse.
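To illustrate what "harder to parse" means for the command-line-style variant, here is a rough Python sketch that splits such an entry into a name and an options dictionary. `split_entry` is a hypothetical helper, not EasyBuild's parser; it assumes every option starts with `--` and takes at most one value.

```python
import shlex


def split_entry(entry):
    """Split 'name --opt value ...' into (name, options dict).

    Hypothetical helper; assumes long options only, each with at most one value.
    """
    tokens = shlex.split(entry)
    if not tokens:
        raise ValueError("empty easystack entry")
    name, rest = tokens[0], tokens[1:]
    options = {}
    i = 0
    while i < len(rest):
        token = rest[i]
        if not token.startswith("--"):
            raise ValueError(f"unexpected token {token!r} in entry {entry!r}")
        key = token[2:]
        if "=" in key:
            # --key=value form
            key, value = key.split("=", 1)
            options[key] = value
            i += 1
        elif i + 1 < len(rest) and not rest[i + 1].startswith("--"):
            # --key value form
            options[key] = rest[i + 1]
            i += 2
        else:
            # bare flag without a value
            options[key] = True
            i += 1
    return name, options


parsed = split_entry("GROMACS-2020.1-foss-2020a-Python-3.8.2 --from-pr 1234")
print(parsed)
```

Note that values come back as strings here, whereas the nested YAML form would let the YAML loader type them (e.g. `1234` as an integer).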

casparvl avatar Jun 08 '22 09:06 casparvl

Just did a local test with

# easystack.yaml
robot: true
easyconfigs:
  - binutils-2.25-GCCcore-4.9.3
  - foss-2018a
  - R-bundle-Bioconductor-3.11-foss-2020a-R-4.0.0
  - GROMACS-2020.4-foss-2020a-Python-3.8.2

Running

eb --easystack easystack.yaml -D

That works fine. However, mixing the easyconfigs and software top-level keys silently ignores anything under software:

$ cat easystack_format_mixed.yaml
robot: true
easyconfigs:
  - binutils-2.25-GCCcore-4.9.3
  - foss-2018a
  - R-bundle-Bioconductor-3.11-foss-2020a-R-4.0.0
  - GROMACS-2020.4-foss-2020a-Python-3.8.2
software:
  OpenFOAM:
    toolchains:
      foss-2020a:
        versions: ['8', 'v2006']
  R:
    toolchains:
      foss-2020a:
        versions: ['4.0.0']
$ eb --easystack easystack_format_mixed.yaml -D | grep FOAM
$ eb --easystack easystack_format_mixed.yaml -D | grep GROMACS
 * [ ] /home/casparl/.local/easybuild/Debian10/2021/software/EasyBuild/4.5.4-dev-boegel-easystack_easyconfigs/easybuild/easyconfigs/g/GROMACS/GROMACS-2020.4-foss-2020a-Python-3.8.2.eb (module: GROMACS/2020.4-foss-2020a-Python-3.8.2)
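One way to avoid silently dropping entries would be an explicit check on the top-level keys of the parsed easystack file. The following is only a sketch of that idea: `check_entry_keys` is a hypothetical function, and raising an error (rather than merging both keys) is an assumption, not what EasyBuild does.

```python
ENTRY_KEYS = ("easyconfigs", "software")


def check_entry_keys(easystack):
    """Return the single top-level key that lists what to install.

    Hypothetical check; raises instead of silently ignoring one of the keys.
    """
    present = [key for key in ENTRY_KEYS if key in easystack]
    if not present:
        raise ValueError("easystack file specifies neither 'easyconfigs' nor 'software'")
    if len(present) > 1:
        raise ValueError("easystack file mixes top-level keys: " + ", ".join(present))
    return present[0]


# parsed form of a mixed file, trimmed to the relevant keys
mixed = {
    "robot": True,
    "easyconfigs": ["foss-2018a"],
    "software": {"R": {"toolchains": {"foss-2020a": {"versions": ["4.0.0"]}}}},
}

try:
    check_entry_keys(mixed)
except ValueError as err:
    message = str(err)
    print(message)
```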

casparvl avatar Jun 08 '22 10:06 casparvl

@casparvl

> So I'm curious: what was your main reason for wanting this format?

Mainly because the current format is overly verbose for the common case where there's nothing special going on (no specific configuration options needed, etc.). I can probably rework this PR so it works with a "mixed" easystack that uses both easyconfigs and software as top-level keys.

> Another question that came to mind: how will we specify options here?

Good question, I haven't thought that through yet. I think something like this could work:

easyconfigs:
    example-1.2.3:
        options: --from-pr 123
    example-4.5.6:

I'll see if I can add dummy support for options, so we pick up on it, but don't do anything with it yet...

boegel avatar Jun 13 '22 09:06 boegel

I've implemented dummy support for option passing and will push it in a second. With this support, the parser now understands this format:

robot: true
easyconfigs:
  - pkgconf-1.8.0-GCCcore-11.3.0.eb:
      options: {
        'debug': True,
      }
  - ncurses-6.3-GCCcore-11.3.0.eb

i.e. easyconfigs is a list, and each item may be either a simple string (an easyconfig name) or a dictionary. In the case of a dictionary, it has this (raw) structure:

{'pkgconf-1.8.0-GCCcore-11.3.0.eb': None, 'options': {'debug': True}}

Later on, we can quite easily have the parser store that in a data structure that satisfies the format proposed here for opts_per_ec, i.e. a dictionary in which the keys are the easyconfig names and the values are dictionaries with all the options.
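As a rough illustration of that conversion, the sketch below turns the raw list items (strings, or dictionaries in the raw shape quoted above) into a list of easyconfig names plus an `opts_per_ec` dictionary. `collect_opts_per_ec` is a hypothetical name, and storing only the entries that actually carry options is an assumption.

```python
def collect_opts_per_ec(entries):
    """Return (easyconfig names, opts_per_ec) for raw 'easyconfigs' list items.

    Hypothetical helper; expects items that are either a string or a dict of
    the raw form {'<easyconfig name>': None, 'options': {...}}.
    """
    easyconfigs = []
    opts_per_ec = {}
    for entry in entries:
        if isinstance(entry, str):
            easyconfigs.append(entry)
        elif isinstance(entry, dict):
            # the easyconfig name is the only key other than 'options'
            names = [key for key in entry if key != "options"]
            if len(names) != 1:
                raise ValueError(f"expected exactly one easyconfig name in {entry!r}")
            name = names[0]
            easyconfigs.append(name)
            opts_per_ec[name] = dict(entry.get("options") or {})
        else:
            raise TypeError(f"unsupported easystack entry: {entry!r}")
    return easyconfigs, opts_per_ec


raw_entries = [
    {"pkgconf-1.8.0-GCCcore-11.3.0.eb": None, "options": {"debug": True}},
    "ncurses-6.3-GCCcore-11.3.0.eb",
]
ec_names, opts = collect_opts_per_ec(raw_entries)
print(ec_names)
print(opts)
```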

casparvl avatar Jul 29 '22 15:07 casparvl

Documentation will be updated to mention support for the top-level easyconfigs key once #4057 is also merged (which adds support for actually applying the configuration options specific to a particular easystack entry).

boegel avatar Oct 12 '22 11:10 boegel