POC: module yaml restructure

Open ewels opened this issue 1 year ago • 4 comments

Proof of concept showing a minor change to the parsing of module processes, to correctly capture the structure of inputs. Currently it just dumps a hypothetical YAML string to the console when running linting, e.g.:

nf-core modules lint affy/justrma
input:
- - meta:
      type: val
  - samplesheet:
      type: path
  - celfiles_dir:
      type: path
- - meta2:
      type: val
  - description:
      type: path
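
The idea behind that dump can be sketched in a few lines of Python. This is only an illustration, not the code in this PR: the process text, the `parse_inputs` name, and the regular expressions are all assumptions, and the process is written by hand so that it reproduces the structure shown above. It handles only simple `val(...)`, `path(...)` and `env(...)` inputs with one channel per line.

```python
import re

# Hypothetical process text, written by hand to mirror the dump above.
PROCESS = """
process EXAMPLE {
    input:
    tuple val(meta), path(samplesheet), path(celfiles_dir)
    tuple val(meta2), path(description)

    output:
    path "versions.yml", emit: versions
}
"""

# Everything between `input:` and the next section label of the process.
INPUT_BLOCK = re.compile(
    r"input:\s*\n(.*?)(?=\n\s*(?:output|when|script|shell|exec|stub):)", re.S
)
# A qualifier followed by a plain variable name, e.g. `val(meta)`.
QUALIFIER = re.compile(r"\b(val|path|env)\s*\(\s*([A-Za-z_]\w*)\s*\)")


def parse_inputs(process_text):
    """Return one list per input channel, each holding {name: {"type": qualifier}}."""
    match = INPUT_BLOCK.search(process_text)
    if match is None:
        return []
    channels = []
    for line in match.group(1).splitlines():
        elements = [{name: {"type": qual}} for qual, name in QUALIFIER.findall(line)]
        if elements:
            channels.append(elements)
    return channels


channels = parse_inputs(PROCESS)
for channel in channels:
    print([next(iter(element)) for element in channel])
```

Each inner list corresponds to one input channel, so tuple members stay grouped instead of being flattened into a single list of names.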

Just a starting point. If we want to fix all module metas, this needs a lot more work:

  • [ ] Change how meta.yml is built when creating a new module from the template
  • [ ] Update the linting code to expect this structure when checking existing modules
  • [ ] Do all of the above for output as well as input
  • [ ] Find and fix all the edge cases
  • [ ] Update the website documentation pages to understand this format
  • [ ] Update the developer docs
  • [ ] Check where else meta.yml is used and update accordingly
  • [ ] Somehow try to migrate all 1k modules to the new format, without losing custom data added by hand 😰
    • Note that even though the structure is different, the named keys should be the same. So hopefully can automate.
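
Because the named keys stay the same, the migration step could in principle be a regrouping by name. A minimal sketch, assuming the old `meta.yml` has already been loaded into a flat list of single-key dicts and the channel structure has already been extracted from `main.nf`; the function name `restructure_inputs` and the sample data are hypothetical:

```python
def restructure_inputs(flat_inputs, channels):
    """Regroup old-style flat inputs so they follow the process channels.

    flat_inputs: old-style list, e.g. [{"meta": {...}}, {"fasta": {...}}]
    channels:    a list of names for a tuple channel, a plain string otherwise
    """
    by_name = {}
    for entry in flat_inputs:
        by_name.update(entry)

    def lookup(name):
        # Hand-written fields are copied as-is; unknown names get an empty entry.
        return {name: by_name.get(name, {})}

    restructured = []
    for channel in channels:
        if isinstance(channel, list):
            restructured.append([lookup(name) for name in channel])
        else:
            restructured.append(lookup(channel))

    used = {n for c in channels for n in (c if isinstance(c, list) else [c])}
    missing = sorted(used - set(by_name))  # in the process, not documented
    unused = sorted(set(by_name) - used)   # documented, not in the process
    return restructured, missing, unused


# Hypothetical old-style entries and channel structure.
OLD_INPUTS = [
    {"meta": {"type": "map", "description": "Groovy Map containing sample information"}},
    {"scaffold": {"type": "file", "pattern": "*.{fasta,fa}"}},
    {"fasta": {"type": "file", "description": "FASTA reference file"}},
]
CHANNELS = [["meta", "scaffold"], "fasta"]

new_inputs, missing, unused = restructure_inputs(OLD_INPUTS, CHANNELS)
print(new_inputs)
print("missing:", missing, "unused:", unused)
```

Returning `missing` and `unused` alongside the result gives a reviewer a list of modules that cannot be migrated purely automatically, which is where the edge cases are likely to show up.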

Related / would fix:

  • https://github.com/nf-core/tools/issues/1555
  • https://github.com/nf-core/tools/issues/1993

ewels avatar Feb 23 '24 08:02 ewels

Created an issue to track this in the modules repo: https://github.com/nf-core/modules/issues/4983

ewels avatar Feb 25 '24 18:02 ewels

Good timing! We need to discuss this. meta.yml is becoming essential to unlock multiple critical features (type checking, output schema, generative workflows), and I think we need to promote it as a Nextflow "standard".

pditommaso avatar Feb 25 '24 18:02 pditommaso

Hello, I would like to take over this PR and want to make sure we are on the same page before starting.

I think we agreed on the new meta.yml structure, as suggested by @mashehu in https://github.com/nf-core/modules/issues/4983#issuecomment-1963572056

 input: 
   - #input tuple
       - meta: 
           type: map 
           description: | 
             Groovy Map containing sample information 
             e.g. [ id:'test', single_end:false ] 
       - scaffold: 
           type: file 
           description: Fasta file containing scaffold 
           pattern: "*.{fasta,fa}" 
   - fasta: 
       type: file 
       description: FASTA reference file 
       pattern: "*.{fasta,fa}" 

The next steps that should be done first are:

  • Update the modules template
  • Update nf-core modules lint
  • Create a command to update the meta.yml

To update the file, I see two options: either add the new nf-core modules update-meta-yml command (https://github.com/nf-core/tools/issues/1555) or add an argument --update-meta-yml to nf-core modules lint which would modify the file. This new command could be used to update all modules from nf-core/modules once we have it.

Does that seem good to you?

mirpedrol avatar May 31 '24 10:05 mirpedrol

@mirpedrol yes that sounds good.

For pipeline linting, we already have the flag --fix (eg. nf-core lint --fix).

Could we have the same approach here? nf-core modules lint --fix. Then it's more generic rather than having a bespoke CLI flag just for this. If I remember correctly, the option can take a string which corresponds to a name of a specific lint test to narrow it down if needed.
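
As a rough illustration of that design (not the actual nf-core implementation), a generic fix option could simply carry the names of the lint tests that are allowed to modify the file. Everything below is hypothetical: `lint_module`, the test registry, and the toy test `meta_input_is_list`.

```python
def meta_input_is_list(meta, fix=False):
    """Toy lint test: `input` must be a list; when fixing, wrap a lone entry."""
    if isinstance(meta.get("input"), list):
        return "passed", meta
    if not fix:
        return "failed", meta
    fixed_input = [meta["input"]] if "input" in meta else []
    return "fixed", {**meta, "input": fixed_input}


def lint_module(meta, tests, fix=()):
    """Run every test, letting only the tests named in `fix` modify `meta`.

    tests: mapping of test name -> callable(meta, fix) returning (status, meta)
    fix:   names of tests allowed to apply fixes; empty means report only
    """
    results = {}
    for name, test in tests.items():
        status, meta = test(meta, fix=name in fix)
        results[name] = status
    return results, meta


TESTS = {"meta_input_is_list": meta_input_is_list}
broken = {"name": "example", "input": {"fasta": {"type": "file"}}}

print(lint_module(broken, TESTS))                               # report only
print(lint_module(broken, TESTS, fix=("meta_input_is_list",)))  # apply the fix
```

With this shape, a restructuring of `meta.yml` would be one more named test rather than a dedicated command-line flag.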

ewels avatar Jun 19 '24 06:06 ewels

🧹 spring cleaning message 🌷

Closing this PR as it was done in @3032

mirpedrol avatar Mar 11 '25 15:03 mirpedrol